Friday, December 29, 2006

Architect's Toolset - CRC Cards


Whenever I am involved in a software architecture design, CRC cards are one of my favourite tools for team discussions. In this context, CRC is not the abbreviation of Cyclic Redundancy Check, as you might assume. It rather stands for "Class, Responsibility, Collaboration".
I have always wondered why I still meet developers who are not familiar with this tool.
So, instead of starting with theory, let us look at an example. In the simplistic example above, I'd like to design a Web Shop. After defining the most important use cases and scenarios, the engineers are now in the middle of determining all core entities. For example, they are playing around with the "customer buys items from catalog" use case. Of course, using a UML tool could be a solution, but have you ever seen more than two people sitting in front of a screen working efficiently on a problem? This simply does not work. Here, CRC cards come into play. A CRC card is a simple sheet of paper containing three separate sections (see figure above):
  • The class name is the name of the component being described. In the example, we are introducing a Shopping Cart.
  • The responsibility section contains all responsibilities this component is in charge of, such as adding or removing items in our example.
  • In the collaboration section we define which other components our component depends on, the so-called collaborators. In the example, the shopping cart must be able to cooperate with the product catalog.
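Once the cards have stabilized, they translate almost directly into code. Here is a minimal sketch of what the example card could become; the class and method names are my own illustration, not part of the CRC technique itself:

```python
class ProductCatalog:
    """Collaborator: knows the products that can be purchased."""
    def __init__(self):
        self._products = {}

    def add_product(self, sku, price):
        self._products[sku] = price

    def price_of(self, sku):
        return self._products[sku]


class ShoppingCart:
    """Class: ShoppingCart
    Responsibilities: add items, remove items, compute total
    Collaborators: ProductCatalog
    """
    def __init__(self, catalog):
        self._catalog = catalog          # collaboration
        self._items = []

    def add_item(self, sku):             # responsibility
        self._items.append(sku)

    def remove_item(self, sku):          # responsibility
        self._items.remove(sku)

    def total(self):                     # responsibility
        return sum(self._catalog.price_of(sku) for sku in self._items)


catalog = ProductCatalog()
catalog.add_product("book", 10.0)
cart = ShoppingCart(catalog)
cart.add_item("book")
```

Note how each section of the card maps to one element of the class: the name, the public methods, and the constructor dependency.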

All of this is simply drawn on a sheet of paper and then put visibly on a whiteboard, a poster, or whatever is available. After a while the board will be filled with CRC cards. To make the collaborations visible, I often connect a CRC card to its collaborator(s) with string and pins.

Because this is a flexible, paper-based solution, it is easy to change CRC cards, remove them again, add new ones, or add and remove responsibilities and collaborators. Everyone in the team is able to participate in the whole brainstorming process.

Obviously, in an agile setting the CRC cards will be incomplete in the beginning and then keep growing over time.

Later on in the design process we might add additional information to the cards such as superclasses.

There is a lot more to say about CRC cards. Ward Cunningham, the "father" of CRC cards, and Kent Beck presented an excellent paper at OOPSLA 1989 that you should definitely read (http://c2.com/doc/oopsla89/paper.html).

Wednesday, December 27, 2006

Product Lines

Software Product Line Engineering is one of the most interesting areas in software architecture today. It is increasingly important in software development, in particular for standard software products and embedded systems, the ones that we usually consider software-intensive. Now, you might ask, what the hell is a software product line? One of the commonly accepted definitions comes from the Software Engineering Institute (CMU), which defines a software product line (SPL) as "a set of software-intensive systems that share a common, managed set of features satisfying the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way". Got it? Let me introduce an example.

Suppose you are going to develop new IDEs (Integrated Development Environments). These new IDEs should come in different flavors: one edition for Ruby, one for Fortress, one for C#. How could we manage to develop such a series of editions? First approach: we develop all editions separately. This turns out to be a nightmare, as we will need to set up a project for each language edition. But even worse, after a while we will recognize that all of these projects develop very similar modules: a GUI, a debugger, a source code browser, a project and configuration management subsystem, a refactoring toolset, an internationalization component, a plug-in manager, and so forth. Many of these core components are programming language independent. So why should we constantly violate the DRY (Don't Repeat Yourself) principle?
Ok, what is the second approach? In our second attempt, we start developing the Ruby IDE, taking special care to implement all subsystems, such as the GUI, the debugger, you name it, in a re-usable way. After the Ruby IDE is completed, we can re-use the existing components for further IDEs. Basically, we get a set of independent Lego bricks which we can compose in different ways. Obviously, this is a much better approach in terms of productivity through re-use, but it turns out to add other challenges. For example, we have to come up with different application architectures that use the existing components in a bottom-up way. All of these architectures require a lot of effort themselves. In addition, it turns out that some of the components developed for the Ruby IDE are not appropriate for the Fortress IDE. Thus, we need to refactor them to be more generally applicable. But after that work we also need to refactor the Ruby IDE to integrate these refactored components. Of course, we might choose to end up with different versions of components. This is exactly one of the reasons why some projects experience death by unsystematic re-use.
Ok, what is the third approach? In the third approach we set up a development team that first analyzes which concrete IDEs we are going to develop. The team will take all requirements and define a general architectural framework for all of these IDEs. It will identify the commonalities, i.e., all assets that should be part of the general framework, e.g., components such as the project and configuration management subsystem that are basically identical no matter which programming language you need to support. The team will also define the variabilities, i.e., things that vary among the different IDE editions. For example, the abstract syntax editor framework might be a commonality, but the concrete syntax editor implementations will vary. As we expect users to be interested in adding their own components to the IDE (for example, tools that check programming conventions), we integrate a plug-in manager. This is an example of a component that supports variation over time. So the general team will set up an architectural framework with explicit variation points as well as other commonly used assets. As we expect these assets and even the architectural framework to evolve, we need to set up an environment in which common assets and their evolution are supported. By the way, assets in this context can be different kinds of things: implementation artifacts, documents, models, test plans, and so forth. The setup of such a project will be very expensive, but the Return on Investment will rock. As soon as everything is in place, the Ruby IDE team will be able to systematically instantiate the architectural framework with all commonalities and bind the variabilities according to their own needs. They will save a lot of time because they don't have to come up with their own new software architecture from scratch, and they can even use existing assets. What we have defined here is Software Product Line Engineering.
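To make commonalities and variation points concrete, here is a minimal sketch in code (the names and structure are my own illustration, not from the SEI definition): the abstract editor framework is a commonality, and each IDE edition binds the language-specific variation point at instantiation time.

```python
from abc import ABC, abstractmethod

class SyntaxEditor(ABC):
    """Commonality: the abstract editor framework shared by all editions."""
    @abstractmethod
    def highlight(self, line: str) -> str:
        """Variation point: bound differently per language edition."""

class RubySyntaxEditor(SyntaxEditor):
    """One binding of the variation point, for the Ruby edition."""
    KEYWORDS = {"def", "end", "class"}
    def highlight(self, line):
        return " ".join(f"<{w}>" if w in self.KEYWORDS else w
                        for w in line.split())

class IDE:
    """The architectural framework instantiated per product."""
    def __init__(self, editor: SyntaxEditor):
        self.editor = editor   # variability bound when the product is built

ruby_ide = IDE(RubySyntaxEditor())
```

A Fortress or C# edition would reuse `IDE` unchanged and only supply its own `SyntaxEditor` subclass.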
The products instantiated from a product line are often collectively called a software program family. I hope you got the basic idea. It is like in other industries such as car manufacturing, where the car design team comes up with a new series of cars, defines what parts all of the models share, and also defines the variabilities. Obviously, a development team must carefully identify whether it is worthwhile to set up product line engineering. It is probably a bad idea when you only have two different products.
So far, the teams that develop concrete products use the architectural framework and the common assets, but the rest is plain-vanilla software engineering. In most cases, there is a lot of room for automating many activities. Take a production line in a car manufacturing plant as an example. In Software Factories the idea is to define a systematic plan of how a product is configured and implemented using the common architectural framework and common assets. The Software Factory then allows automating the process of developing a concrete product by leveraging model-driven software development. To make this less challenging, the idea is to come up with a domain-specific language (instead of forcing everyone to use UML) and provide a generator that encodes all knowledge about the architectural framework as well as about commonalities and variabilities. The generator will take a model as input and generate (most parts of) the concrete product.
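As an illustration only (the model format and the generator are invented for this sketch and not taken from any real Software Factory product), a generator might read a small domain-specific model and combine it with the common assets:

```python
# A toy model in a domain-specific vocabulary: which edition to build
# and how its variation points are bound.
model = {
    "edition": "Ruby IDE",
    "language": "ruby",
    "plugins": ["refactoring", "debugger"],
}

# Common assets shared by every product of the line.
COMMON_ASSETS = ["gui", "project-management", "plug-in-manager"]

def generate(model):
    """Combine the common assets with the variabilities bound in the model."""
    return {
        "name": model["edition"],
        "components": COMMON_ASSETS + [f"{model['language']}-editor"],
        "plugins": model["plugins"],
    }

product = generate(model)
```

Generating the Fortress edition would then mean changing only the model, not the generator or the common assets.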
An issue we must consider in this context is how to deal with evolution and change of core assets. It is unlikely that the architectural framework or other assets will remain unchanged forever. Thus, a systematic approach is required to feed experience from product instantiations back into the product line itself. This also implies that you should always have separate teams for the product line itself and for the products. Otherwise, organizational issues might be very challenging. In other words: even if you adopt a product line engineering approach, you might be doomed to fail due to organizational problems. Needless to say, the development process is also significantly different, which you should take into consideration.
An interesting issue is the differentiation between a platform and a product line. In many cases, the most important core asset of a product line will be a complete platform. Eclipse is a good example of this. In this case, the platform will be the critical part of your whole product line. If it suffers from design erosion (see my last blog posting), this will have an impact on ALL products. That's why systematic evolution of common assets is so essential. Note that a software product line does not necessarily have to introduce a platform. This is only one extreme, where we have a prefabricated implementation of the common architecture and other common subsystems and components. Another option might be an application framework. Or we might only have a set of loosely coupled artifacts and a software architecture. Thus, there are a lot of options for implementing software product lines.
In future postings I will address architectural issues related to product lines in much more detail.

Friday, December 22, 2006

Design Erosion

Most software systems start with a clean and comprehensible software architecture. But then, developers are often forced to add or change the system under increasing time pressure. After a while the system is polluted with workarounds. The strategic core architecture becomes vague or is even completely lost. Design erosion has caused a breakdown of the original software architecture. Is this the normal way of an architect's life, or did we simply fail? In the waterfall model, design erosion is inevitably caused by architects trying to cover all requirements at once, the big bang approach. As you know, in a sufficiently large system you can never anticipate all requirements. Thus, you can never come up with a sound architecture in the first place. From my viewpoint, this problem illustrates why agile processes are not an alternative but a must. To cope with these challenges, we have to embrace change. Change will definitely happen. Hence, we as architects must open our software systems to allow tactical extensions. When additional use cases appear or priorities change, even the strategic design might be subject to change. Refactoring will hence be an important tool for architects. As implementation and architectural soundness must go hand in hand, applications such as Sotograph enable architects to make sure that the implementation does not violate architectural guidelines and conventions as well as quality requirements. But this is only the tip of the iceberg. Testing, organisational issues, and many other aspects play an important role. It is possible to prevent design erosion if your process model, organization, methods, and tools are appropriate. In product lines these factors are even more important because of the significantly larger impact of any change. In some circumstances, however, requirement changes caused by new business models, technologies, or desired features might be too far-reaching.
In these cases, it is sometimes much more effective to throw away the old system and build a new one from scratch. "Throwing away" does not mean that you should throw away your well-proven best practices or experiences. Just get rid of the old software system, not of your expertise. As a consequence, a good project will enforce the documentation of best practices (e.g., domain models or DSLs, patterns, guidelines). Design erosion is also a good example of why all those comparisons of software engineering with other disciplines often amount to comparing apples and oranges. For instance, when building a house, no one would ever propose adding additional rooms or floors after construction is completed. But people tend to think software engineering should support exactly that kind of flexibility. That is also the reason why software engineering, and especially designing a software architecture, is a really tough task.

Sunday, December 10, 2006

Teaching the Architects

I published the last posting several weeks ago. But, of course, I have a good excuse: In the meantime, I gave a seminar on Software Architecture and participated in several Siemens-internal events. But now the time has come to continue with my architecture blog.
This time I'd like to address one specific issue: how can we effectively teach other people everything they need to know about designing a high-quality software architecture? For the sake of brevity, I won't cover what a software architect should know in this posting. In the seminars Frank (Buschmann) and I give, we mostly offer a mixture of PowerPoint presentations, discussions, and group exercises. In the presentation sections we also address things we've learned in projects. In the group exercises, 3-5 attendees are asked to design a small example application such as a Web Store or a Chat Server. This setting works very nicely, but is far from sufficient. Thus, an additional approach is to teach and consult small project teams in real-world projects. In this scenario we start with a series of seminars and then participate in real-world projects as architecture consultants. However, this isn't sufficient either, as we also need a kind of certification for software architects. Of course, small companies might not be able to afford all that time and budget for educating their staff in architecture concerns. But that is another topic.
I'd like to know what you think. I'd also like to know about your experiences. What did work for you and what didn't? How did you become a software architect? I am looking forward to all comments.

Thursday, November 09, 2006

Ultra Large Scale Systems (ULS)

At the last OOPSLA, Linda Northrop gave a talk on Ultra-Large-Scale systems (ULS). ULS introduces a lot of unprecedented challenges. A ULS is basically a system of systems, each of large scale itself. Consider the Internet as one possible example, especially when you combine the Internet with Pervasive Computing. There are a lot of issues to be taken into account for such systems. One example is human interaction, which will definitely be part of those systems. Linda motivated the problem we face with such systems with the metaphor of building cities instead of buildings, as we do today. How can architects cope with such systems? From my viewpoint, the development of ULS systems will be driven by emergent behavior and a mixture of system and software architecture efforts. By emergent behavior I address the problem that it is simply impossible to anticipate all networked entities, especially when we consider continuous change. Thus, we can't come up with a fixed architecture. We are entering completely new areas here. There is no chance to really foresee all the implications of a ULS on software architecture now. However, we will need to introduce software/system architecture principles and guidelines that should shape the architecture. In addition, we will need to come up with some core components, such as repositories, persistence stores, and other kinds of central services that are essential. Everything would initially be driven by the typical workflows that should happen in those systems. Essentially, we will need to start simple: use simple abstractions and principles that can be combined into more powerful structures. The human brain is an interesting example, as its main constituents are rather "simple" while its overall behavior (i.e., emergent behavior) is incredibly complex.
One should keep in mind that it took mother nature millions of years to develop those brains by constantly modifying and enhancing the design, even removing old designs that didn't work, until today and beyond. That implies that we won't get it right in the beginning. Thus, ULS development will definitely need experimenting and playing around with concepts and configurations.
Swarm intelligence was a sort of precursor in that direction: define the behavior of entities as well as their interactions locally, and then let the system act globally. The killer application was ant colonies.
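A toy sketch of that idea (purely illustrative, and deliberately simplified compared to real ant colony optimization algorithms): each step applies only a local rule, yet pheromone accumulates faster on the shorter of two routes, so the colony as a whole "discovers" it.

```python
# Two routes between nest and food; route 0 is shorter.
lengths = [1.0, 2.0]
pheromone = [1.0, 1.0]

def step():
    """Local rule only: the fraction of ants on a route follows its
    pheromone share, and deposits are inversely proportional to length."""
    total = sum(pheromone)
    for i in range(2):
        share = pheromone[i] / total        # local attractiveness
        pheromone[i] += share / lengths[i]  # shorter route gets more deposits
        pheromone[i] *= 0.95                # evaporation forgets old choices

for _ in range(100):
    step()

# Emergent global behavior: the shorter route ends up dominating.
```

No agent ever compares the two routes; the preference emerges from the feedback loop between deposits and evaporation.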

Friday, November 03, 2006

Architect always implements

This night I had a dream. I was supposed to have a surgery. Eventually, the physician came to the room. I heard him whispering to a nurse, giving her a document. It was a list of instructions he had written down. Oh, no! The nurse was going to be the surgeon!
Sounds like a nightmare. But isn't it something at least some "software architects" should be familiar with? A group of architects specifies a software architecture. Once finished, they throw it over the fence to the developers and disappear. I know that the comparison is like comparing apples and oranges. However, there is some truth in it.
In my last tutorial on software architecture, Scott Meyers asked me why I recommend the "architect always implements" principle. And that is exactly the point. The task of a software architect starts early in the process and spans maintenance and evolution. After the strategic core design is complete, architects help to refine this core into the more concrete, tactical design. And software architects always need to supervise the implementation as well as decide on tools and technologies. Find more details in my former postings.
If you consider these tasks, a software architect is only capable of fulfilling them when she/he gets some implementation practice. Another issue is "eat your own dog food": you as an architect should make sure your designs work as expected by helping with their implementation. That also gives you more credibility with other project members. And the other point is: how could you possibly recommend tools and technologies without practical experience? Would you trust a software architect who has not implemented anything for the last ten years and has no clue about UML, middleware, C++/Java/.NET, Eclipse, Visual Studio, databases, Subversion, ClearCase, Maven, ...?
Does that mean you need to implement a lot? Definitely not. It rather implies that you should implement some smaller parts that are not on the critical path, because you will still have a lot of other work such as communication with stakeholders, refactorings, meetings, refinements, you name it.
Of course, it also depends on project size. In a small project with only 5 people, you cannot afford an architect who does not implement. In a large-scale project with maybe hundreds of people involved, it might not be feasible to implement anything, at least not for the lead architect. However, if you're a lead architect, you should keep yourself knowledgeable by applying tools and technologies in small toy projects, and by participating in design reviews and code reviews. Of course, a software architect, as I described in a former posting, is not born as such but will start as an excellent developer and may turn into a software architect after a couple of years. Thus, she/he should like implementing anyway.
In the projects I've been involved in, "architect always implements" has proven to be an excellent principle. That's the reason why I wholeheartedly recommend it.

Thursday, November 02, 2006

Architect's Project Diary

Have you ever been in a meeting where participants discussed a topic which had already been discussed several times before, but no one was sure what the concrete decision had been? If yes, didn't you consider those recurring and endless discussions a total waste of your time? To be honest, this is something you will also experience in your family life. But isn't there something we can do to avoid these unnecessary discussions? As software development projects are filled with meetings, the probability of such discussions is tremendously higher, which is a good motivation to think about countermeasures. What I consider extremely helpful is a kind of project diary. Within a team, whenever you decide on important issues, or on issues which you know might raise further discussions, you should document these decisions in the project diary. Mention who was involved in the decision process, what alternatives had been discussed, which ones were ruled out and which ones accepted. Explain the "what" but also the "why" in this context. Obviously, it doesn't make sense to document everything in full detail. Fundamental decisions, however, need appropriate documentation in your diary. You might structure your diary simply as a list of meeting minutes or organize it by topic areas, whatever fits your needs. The advantage of such a diary is that you can understand decisions even a long time later and explain to people why exactly a decision had been taken. Or the other way round: if your requirements or your project context have changed and you need to revise previous decisions, the diary might give you a clue which decisions must then be "refactored". Anyway, a diary is absolutely helpful when done appropriately and without unnecessary details.
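One lightweight way to structure such a diary entry is a small decision record; the fields below are my own suggestion (and the sample entry is invented), not a fixed standard:

```python
from dataclasses import dataclass, field

@dataclass
class DecisionRecord:
    """One diary entry: the 'what' and the 'why' of a decision."""
    date: str
    topic: str
    decision: str                     # what was decided
    rationale: str                    # why it was decided
    participants: list = field(default_factory=list)
    rejected_alternatives: list = field(default_factory=list)

diary = [
    DecisionRecord(
        date="2006-11-02",
        topic="Persistence",
        decision="Use an object-relational mapper",
        rationale="Team knows the tool; the schema changes frequently",
        participants=["architect", "lead developer"],
        rejected_alternatives=["hand-written SQL layer"],
    ),
]

def decisions_on(diary, topic):
    """Look up earlier decisions when a topic resurfaces in a meeting."""
    return [r for r in diary if r.topic == topic]
```

When the persistence discussion comes up again, `decisions_on(diary, "Persistence")` answers it in seconds instead of another hour-long meeting.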

Thursday, October 26, 2006

Live from OOPSLA 2006 (continued)

I will be less verbose in this second part of my travel report, as I am summing up almost the whole OOPSLA from my personal viewpoint (it is impossible for a single person to attend all the interesting things at such a large conference). Thus, my impressions are a little bit biased, as they only reflect my personal research interests. And I am a lazy writer :-) So don't expect me to give you every detail. You should definitely attend OOPSLA yourself next time.

I met a lot of friends and other people during the conference days (that is, as some say, the most important part of a conference). And, as Terri Parr (conference chair) proposed, I met a lot of new people. That was a really exciting experience. One of the funny stories is that I met a professor and it turned out that he was Axel Schreiner, one of the famous German computer scientists who left Germany and now lives in the US (I read his books on compilers and many other topics years ago during my university education).

Tuesday:
This is the official start of the main conference. It is the 20th OOPSLA anniversary, and it is back where it started: in Portland, Oregon. This year, OOPSLA is co-located with the Dynamic Languages Symposium, GPCE, and PLoP. Terri Parr, the conference chair, announced that there are 1140 attendees, 430 from overseas, with 460 first-timers. Oregon locals: only 59! This proves that OOPSLA is a truly international event.
The program committee accepted 26 out of 157 submissions, with 5-7 reviewers per paper.
2 essays out of 9 were accepted. In the Onward! track, 10 of 22 submissions were accepted and 2 will appear in the proceedings. Terri told the audience in her greeting talk that CiteSeer rates OOPSLA as one of the 50 most influential conferences.
The first keynote (Onward!) by Brenda Laurel (California College of the Arts) was on designed animism. Basically, it was a little bit esoteric but nonetheless interesting and inspiring. Brenda illustrated that every animal and plant has a "soul". She mapped all those ideas of animism to distributed and pervasive systems. And the basic driver for all of this is having fun. That's a rather short and incomplete explanation, I know. But thinking about it is left to you as an exercise :-)

In the language track (Research Papers) I heard a talk on how to dynamically extend types with expanders. Instead of inheritance or source code changes, it is better to use adapters. With tool support this is a powerful feature.
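I can't reproduce the expander mechanism here, but the underlying adapter idea looks roughly like this (a generic sketch with invented names, not the paper's notation): new behavior is attached to an existing type without inheriting from it or touching its source.

```python
class LegacyLogger:
    """Existing type we cannot, or do not want to, change."""
    def write(self, text):
        return f"LOG: {text}"

class TimestampedLogger:
    """Adapter: adds behavior without inheritance or source changes."""
    def __init__(self, wrapped, clock):
        self._wrapped = wrapped
        self._clock = clock          # injected so the example is testable

    def write(self, text):
        # Delegate to the wrapped object, enriching the input on the way.
        return self._wrapped.write(f"[{self._clock()}] {text}")

log = TimestampedLogger(LegacyLogger(), clock=lambda: "12:00")
```

The point of the expanders work, as I understood it, is that the tooling makes such extensions look like ordinary members of the original type.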

Linda Northrop (SEI) gave an invited talk on ULS (Ultra-Large-Scale Systems). The bottom line was "scale changes everything" in terms of software development. Take the Internet as an example, with its decentralization and large numbers of users and data. Other applications include healthcare infrastructure, homeland security, networked automobiles, and military systems (platforms, sensors, weapons, warfighters). ULS is a research study sponsored by DARPA. Goal: US information dominance.

Problems of such systems include continuous change and sustainability. These systems are much more complex than anything we have seen before. Heterogeneity is an additional issue. Regarding the technical challenges, the US Army is concerned that the complexity could be beyond reach (what about one-billion-line systems?). That's why they initiated the study by the SEI team and an expert panel. Process: meet with experts from different fields at a 3-day meeting. The result was - as you might have expected from an expert group - a research agenda. They took a reductionist approach, according to Linda.
The first question they asked themselves: when do we encounter a ULS? Possible properties to look at could be:

  • size: in lines of code
  • number of connections
  • perception
  • number of processes, interactions
  • number of overlapping policy domains
  • number of people involved

Expect unprecedented scale in some of these properties in a ULS. ULS will be interdependent webs of systems, people, and other things. These are basically webs of systems, each of them of Internet scale (!!!).

Characteristics of an ULS system:

  • decentralized
  • inherently conflicting, diverse, unknown requirements
  • continuous evolution and deployment
  • heterogeneous inconsistent changing elements
  • erosion of the people/system boundary
  • failures are normal
  • new paradigms of acquisitions and policy

Today's approach is to engineer large systems top-down and controlled, and to use an agile perspective for small projects. For a ULS, a new perspective is required.

Metaphors to make this more obvious: today we construct buildings; ULS is like building cities. Thus, guidance and rules are required. Another metaphor could be to think in ecosystems (independent systems, interactions, ...). A ULS can be considered a socio-technical ecosystem with a lot of competition for resources and varying policy settings. In a ULS we deal with ecosystems made of people, software, hardware, and governments. Hence, engineering cannot be the metaphor. There are many conflicts with our approaches today. ULS == decentralized, which is in conflict with today's centralized approach. Requirements in a ULS are mostly unknown, while today we believe that requirements can be known in advance and only change slowly. One of the additional tradeoffs will be stable continuous evolution: today we think of system improvements in intervals. In a ULS we face huge heterogeneity, while in today's software systems we believe that effects can be predicted and that the configuration is accurate and under control. Social interaction is not considered at all today.


Challenges in three areas: design and evolution, orchestration and control, monitoring and assessment.

To investigate ULS we need to establish interdisciplinary teams and learn from existing projects. Game theory might be a good starting point, as well as the study of other existing networks (biology, the human body, ...).


The ULS team came up with 7 areas of research:

  • Human Interaction: "understanding users and their contexts"
  • Computational Emergence: digital evolution, metaheuristics in software engineering, algorithmic approaches in software engineering
  • Design: on all levels: software, rules, regulations, .... (the whole ecosystem)
  • Computational engineering: more expressive languages, other kinds of modularity, formal techniques
  • Adaptive System Infrastructure: decentralized configuration and evolution
  • Adaptable and predictable system quality: qualities - what do they mean in an ULS context
  • Policy, Acquisition, Management

Find the study as a PDF at http://www.sei.cmu.edu/uls. A collaborative network is beginning to develop.

After the keynote, people gave the feedback that there was a little too much US military proximity, but that the topic is highly relevant.

There was also a panel on the same issue with people such as Doug Schmidt, Gregor Kiczales, and Linda Northrop. Here are only some excerpts from the panel to give you an impression.

  • Gregor, Doug: First we need to solve problems of small systems before moving to the ULS.
  • Kevin: Software as a problem must be solved in general.
  • Martin, who is focusing on monitoring and repairing systems: uncoordinated growth requires repair and monitoring support. Resources must be available to enable growth.
  • Ricardo: Let us play with such systems to find out, before actually deploying them.
  • Peter: Lot of lessons we have not learned.
  • Neil Harrison (not a participant of the panel) cited Conway, who said that organisations influence ULS. Look at your organization: that's how your systems will look.
  • Peter: Deficiencies show lack of communication in teams. Conway's law is fundamental.
  • Doug: ULS is about heterogeneity. This can be challenging when the parties involved are not really in favor of cooperating. Important: building cities instead of buildings. The design process is important. The architecture consists more of rules. We need more abstractions.
  • Gregor: ambiguous meaning. We do not even have a common understanding of most of the terminology yet.
  • An attendee from the audience asked about precursors of ULS. Doug thinks current systems like military systems are precursors. A system that integrates all critical government systems could be a precursor. Linda mentioned that the Internet combined with wireless devices could be a precursor.
  • Another attendee asked about the metaphors the team came up with. Cities, ecosystems, and human interaction were the ones. The panel was a little bit inert and difficult, as most participants are rather technology-biased. Sociology has many good points, as Doug mentioned.

On Tuesday evening came the most emotional part of OOPSLA. IBM invited attendees to a dessert buffet and a memorial to the life of John Vlissides, who died so early in late 2005. The desserts were John's favourites, in case you wonder. Interestingly, the last time I had personally met him, on the last day of OOPSLA 2004 in Vancouver, we had talked about - cookies. If you don't know John: he was one of the Gang of Four, the authors of the seminal book on design patterns. Erich Gamma, Ralph Johnson, and Richard Helm (the other authors) talked at the evening event about their memories of working on the book project with John. They also showed some video excerpts which really revealed to the audience what a smart and nice guy John was. My personal recommendation: search for John Vlissides in Google and learn more about him.

Wednesday:
A keynote on "A Growable Language" given by Guy L. Steele introduced Fortress. See http://research.sun.com/projects/plrg/ for more details. Fortress's design goal was to do for Fortran what Java did for C. Guy, who was involved in the design of Java, mentioned that there were many proposals to add scientific computing features, which were rejected to keep the language small and portable. Thus, funded by US DARPA IPTO (High Productivity Computing Systems program), Guy and his team started a new language design from scratch. Goal: availability by 2010 and beyond. Similar work is being done by IBM (e.g., X10). Fortress promises accelerated development even for desktops (multicore). The key idea of Fortress: don't build a language, grow it. E.g., many people would like to add a lot of different primitive types. But no way! That's not how Fortress works. Whenever possible, Fortress's approach is to provide a library instead. Hence, library designers need control over syntax and semantics (not just over method calls). Fortress offers a few primitives: binary words of different sizes, linear sequences, heap sequences, user-defined parameterized types, and user-defined polymorphic operators. The compiler leverages aggressive type inference and aggressive static and dynamic optimization. Libraries provide types such as lists, sets, matrices, integers, and floats ... with physical units. Data structures might be local or distributed. Fortress also intends to ease parallelism. In addition, it contains a programming notation that is much closer to math. Instead of having to deal with Unicode characters, Fortress offers an ASCII (Wiki-like) notation. Example: A UNION {1,2,3,4}.
The strategy of the language specification team consists of studying existing applications, studying how libraries can improve coding, and only then adding language features. Replaceable components in the language system avoid a monolithic standard library and encourage change. Fortress tries to make abstraction efficient. The type system basically consists of objects and traits (like interfaces, but they may contain code). Multiple inheritance of code (not of fields) is supported, as well as multiple inheritance of contracts and tests (automated unit testing). Traits and methods may be parametrized. Primitive types are "first-class data and control models". To implement Fortress, more flexibility is required than just compiler optimization. Things such as transactional memory (atomic blocks, ...) will be relevant. Libraries may define which operators have infix or prefix precedence, whether juxtaposition is meaningful, and what operators mean.
Parallelism in Fortress was not a goal but a pragmatic compromise. Parallel programming is still difficult and error-prone. The question the Fortress designers asked themselves: can we encapsulate it in libraries? They managed to do so to some extent, but also had to add some language support. Loops are parallel by default in Fortress (a library convention).


Charles Simonyi presented the tool of Intentional Software Corp in Onward!, which allows combining different DSLs and using an IDE to get different perspectives on the same source code. A very interesting and inspiring talk. The tool is currently being evaluated by Cap Gemini and ThoughtWorks. Basically, it allows you to view your code as code but also to have other perspectives on it at the same time. For example, if you implement a state machine, state diagrams could be made visible in your IDE editor. This is an excellent idea that cannot adequately be described in a textual blog. Expect more from this cool company in the future.


Another invited talk was given by Joshua Bloch (Google, formerly Sun), whom many of you may know as the author of the Effective Java book. His topic: how to design a good API and why it matters.

APIs are important assets for companies but might also be a liability. The process of API design basically consists of:

  • Gather Requirements with a healthy degree of skepticism
  • Start with a one page document
  • Write to your API early and often
  • Writing to the SPI is even more important
  • Maintain realistic expectations.

General Principles that are useful in this context:

  • APIs should do one thing and do it well (also important here: use comprehensible names)
  • An API should be as small as possible, but no smaller.
  • Implementation shouldn't impact the API. Minimize accessibility of everything.
  • Names matter - an API is a (little) language.
  • Documentation matters - document religiously.
  • Consider the performance consequences of API design decisions.
  • APIs have to co-exist peacefully with the platform.

In terms of class design Joshua gave the following advice:

  • minimize mutability
  • subclass only when it makes sense
  • design and document for inheritance or else prohibit it
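
To make the first point concrete, here is a minimal Java sketch of an immutable value class (Money and its methods are invented for illustration, not taken from Joshua's talk): all fields are final, there are no setters, and operations return new instances instead of mutating state.

```java
// Immutable value class: all fields final, no setters.
final class Money {
    private final long amountInCents;
    private final String currency;

    Money(long amountInCents, String currency) {
        if (currency == null) {
            throw new IllegalArgumentException("currency must not be null");
        }
        this.amountInCents = amountInCents;
        this.currency = currency;
    }

    long amountInCents() { return amountInCents; }
    String currency()    { return currency; }

    // Operations return new instances instead of mutating state.
    Money add(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(amountInCents + other.amountInCents, currency);
    }
}
```

Because instances never change, they can be freely shared and cached, and they are inherently thread-safe.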

Method design is another point he mentioned:

  • don't make the client do anything the module could do (otherwise lots of boilerplate code results)
  • don't violate the principle of least astonishment
  • fail fast - report errors as soon as possible after they occur
  • provide programmatic access to all data available in string form (e.g. stacktrace as one string vs. everything as set of elements)
  • overload with care
  • use appropriate parameter and return types
  • use consistent parameter ordering across methods
  • avoid long parameter lists
  • avoid return values that demand exceptional processing
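
The "fail fast" rule can be sketched as follows (the Schedules class below is a made-up example, not from the talk): bad input is rejected at the method boundary, so the error points at the real culprit instead of surfacing later in unrelated code.

```java
final class Schedules {
    // Fail fast: reject bad input immediately rather than letting it
    // cause a confusing failure somewhere else.
    static int repeatEvery(int intervalMinutes) {
        if (intervalMinutes <= 0) {
            throw new IllegalArgumentException(
                "intervalMinutes must be positive, got: " + intervalMinutes);
        }
        return 24 * 60 / intervalMinutes; // executions per day
    }
}
```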

Exception design is another important matter:

  • throw exceptions to indicate exceptional conditions
  • favor unchecked exceptions
  • capture information in exceptions.
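
The last point, capturing information in exceptions, might look like this in Java (an invented exception type, shown only as a sketch): the failure details are available as typed accessors, not just buried in the message string.

```java
// An unchecked exception that captures failure details programmatically
// instead of offering them only inside the message text.
final class TransferFailedException extends RuntimeException {
    private final String account;
    private final long requestedCents;

    TransferFailedException(String account, long requestedCents) {
        super("transfer of " + requestedCents + " cents from account "
              + account + " failed");
        this.account = account;
        this.requestedCents = requestedCents;
    }

    String account()      { return account; }
    long requestedCents() { return requestedCents; }
}
```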

On Thursday I gave my two tutorials (Software Architecture - Strategies, Qualities, Principles as well as SOA from an Architectural Viewpoint). Both were well attended and I got a lot of positive personal feedback afterwards. Neil Harrison and Scott Meyers both attended my Software Architecture tutorial and asked smart questions (as I had expected). I will put PDFs of these tutorials on my Web site, so I won't go into the details here. The most interesting event of the day was a fire alarm during my first tutorial. There was a power outage in the part of Portland where the venue was located. Hence, I turned my tutorial into a reading session while the projector didn't work. I guess that's at least something the attendees will remember ;-)

The next day I flew back home to Munich. I am really looking forward to next year, when OOPSLA will take place in Montreal.

Tuesday, October 24, 2006

Live from OOPSLA 2006

This is the third day of my OOPSLA/GPCE 2006 experience. This year, OOPSLA takes place in the Oregon Convention Center in Portland.
I spent half of the time talking to other people, such as Rob van den Berg, Arno Zimmermann, Erik Meijer, Jimmy Nilsson, Neil Harrison, Peter Sommerlad, Doug Schmidt, Arno Haase, and Markus Völter. Conferences are an ideal place for meeting other interesting people.
Yesterday, I attended a tutorial by Arno Haase and Markus Völter on Domain Specific Languages and Model-Based Development. As their tool chain they used openArchitectureWare and Eclipse. As a running example, Arno and Markus used the domain of state machines. The tutorial was absolutely entertaining. Now I know all of the important buzzwords. Just kidding :-) Indeed, the main benefit of the tutorial was its pragmatic approach. Funny to see Markus torturing Eclipse to get the most out of it.
Today, I had planned to see Niclas Nilsson's tutorial on how to write code generators. Unfortunately, the tutorial was already full. Tutorial speakers like me can attend other tutorials free of charge if there are seats left - and that can be a challenge. As an alternative, I intended to participate in a tutorial on programming the Sun SPOT robots. Guess what? Full as well. Thus, I eventually attended a tutorial by Jeff Garland and Richard Anthony, both experienced practitioners in building large-scale systems. They talked about building solid distributed enterprise software architectures. In detail, they focused on how UML and architectural principles can help for this purpose. It was the intent of the tutorial to illustrate how to effectively use UML and architectural principles to design and document an architecture. I missed the architectural principles aspect a little bit. All in all, I can recommend the tutorial.
This evening, I will meet Eric Evans and Jimmy Nilsson for dinner. Before that, there will be the Welcome Reception, the first in a series of social events.

Wednesday, October 18, 2006

Active Web - formerly known as Web 2.0

I am track chair of a track on Web 2.0 for the upcoming OOP conference (http://www.oopconference.com). In this context, I also agreed to give an introductory talk on what Web 2.0 is all about. A simple task, I thought in the beginning. Well, Web 2.0 is about social networking and AJAX and tag clouds and YouTube and Digg and podcasts, and ..., you name it. The name Web 2.0 is really confusing, isn't it? But what does Web 2.0 really mean?
From my viewpoint, there is a big commonality among all those concepts and sites. Everything in Web 2.0 is about (inter)activity.
In AJAX and similar technologies, the web page is not passive anymore. Instead, it contains code fragments that actively pull/push information from/to backend servers to achieve a better user experience.
In social networks users are not passive consumers anymore, but actively connect with each other. To enable this, web sites must actively connect different people with each other.
In pages such as Flickr, YouTube, Digg, Amazon people actively share information and media. Podcasts and VideoPodcasts allow people to become active content providers. Same for P2P networks, Wikis, Blogs, ....
In other words, Web 2.0 should be renamed to reflect the aforementioned observations. It is about the evolution of the Web from a passive medium - with a small group of active content providers and a large group of passive content consumers - to a new Web where the boundaries between content providers and content consumers more or less disappear. This new Web provides technologies and means for Web users to switch from passive consumer to active provider almost immediately. It is a Web where everyone can participate. It is no longer a technology platform dedicated to large companies for marketing purposes or for establishing Web shops. It is more about social interactions and active participation. It is an open medium for everyone.
My suggestion thus is: let us rename Web 2.0 to Active Web. I think this name reflects much better what the new Web evolution is all about.

Sunday, September 10, 2006

Architectural Beauty

Whenever we enjoy music, literature, or paintings, we associate some kind of beauty with these works of art. Physicists and mathematicians consider some theories incredibly beautiful and elegant, for example the theory of relativity or quantum physics. The architecture of buildings can also be beautiful. And what about software architects? Do we consider some software architectures beautiful? Well, the answer is not really surprising. Of course, every one of us from time to time encounters a software system he or she feels very comfortable with. Two questions immediately arise: Are there objective properties a software architecture must or should reveal so that we consider it beautiful? And if that is the case, is architectural beauty an important quality aspect, or is it just a kind of perception without further value?
For me personally, properties such as the following are important for considering a software architecture beautiful and elegant:


  • Simplicity: A software architecture should be as simple as possible, but obviously not simplistic. If an architecture reveals unnecessary complexity, it is almost impossible to capture the strategies and the tactical design behind it. This is strongly coupled with readability. To check whether a software architecture is simple, ask the chief architect to bring in a person who has never seen the architecture before and to explain the architecture to that person in no more than 5 minutes. If the person can grasp the fundamental architectural idea within that time, the architecture should be simple. Another related issue in this context is expressiveness. An architecture that is expressive implements a (domain) model consistently. Hence, it is very easy to find the key domain entities and use cases in the architecture. I am emphasizing the domain model because some (ugly) software architectures tend to mix implementation issues with domain entities, which makes them not expressive at all. Separation of concerns is one of the means to achieve expressiveness.


  • Orthogonality or conceptual integrity means that within a particular software architecture the same solution is applied to address the same problem (context). This is where patterns come in: using the same patterns for the same problem contexts is important. Likewise, it is essential not to unnecessarily reinvent the wheel again and again. This is also sometimes referred to as the principle of least surprise. Of course, orthogonality is not constrained to patterns. For instance, a software architecture lacks orthogonality if you find different kinds of error handling strategies or memory management strategies all over the system. Note that orthogonality is closely related to simplicity and expressiveness.


  • Correctness: Even an expressive and orthogonal system might be plain wrong. If you ask a development team for a spreadsheet application, you won't be very happy as a user if you receive a text processor instead. Correctness is not limited to functional aspects. A system that does the right thing but takes an incredible amount of time for each activity is also considered incorrect from a user perspective. Correctness can be improved by re-use: re-using well-proven components or designs instead of inventing your own stuff is obviously helpful.


  • Symmetry is an important topic as well. There are two kinds of symmetry, structural and functional symmetry. Structural symmetry is tightly coupled with conceptual integrity (see above). Functional symmetry means, for example: when there is an open method there should also be a close method. Breaking symmetry in this context may lead to incorrect systems. Kevlin Henney used functional symmetry to illustrate why he considers the GoF factory pattern incomplete. In the factory pattern there is a create method that hides all the complexities of object creation from an object's user. The factory pattern states that it is applicable in all situations where object creation is rather complex. Kevlin argued that when object creation is complex, object deletion is also complex in most circumstances, so there should also be a delete method in the factory pattern.
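
Kevlin's argument can be sketched in Java (all names below are invented for illustration): a factory that encapsulates complex creation also offers the matching disposal operation, restoring functional symmetry.

```java
interface Connection {
    void send(String data);
}

// Symmetric factory: if creation is complex enough to hide behind a
// factory, disposal usually is too, so the factory offers both.
interface ConnectionFactory {
    Connection create(String endpoint);
    void dispose(Connection connection);
}

final class PooledConnectionFactory implements ConnectionFactory {
    private int openConnections = 0;

    public Connection create(String endpoint) {
        openConnections++;                       // complex setup would go here
        return data -> { /* transmit data in a real implementation */ };
    }

    public void dispose(Connection connection) {
        openConnections--;                       // complex teardown would go here
    }

    int openConnections() { return openConnections; }
}
```

Whoever knows best how to build the object also knows best how to tear it down again.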


Of course, this list is far from being complete. I just introduced some points to give you the idea.
There are also signs that an architecture lacks architectural beauty. For example, violations of layering or dependency cycles in the design are always signs of severe problems or design erosion - which brings me to another point:
Often, architects and developers come up with a very beautiful architectural design in the beginning. After some change requests or extensions to the system, which are often applied under time pressure, the software architecture erodes and with it the architectural beauty. This means architectural beauty is not carved in stone but might disappear after a while if you are not cautious.
All of these properties of software architecture that I consider preconditions for architectural beauty are also qualities. They help to achieve a specific purpose. Thus, architectural beauty is not independent of architectural quality. They are just two sides of the same coin.

Thursday, September 07, 2006

Live again!

I am back from vacation. When my vacation started, the weather turned from sunny to rainy. After my return to the office, the sun is back. Must be related to Murphy's Laws.
On my vacation I used the time to increase my knowledge. Well, that's almost the truth :-) In fact, I spent a lot of time on my sports activities such as running and biking. The rest of the time I was busy preparing some articles and talks.
  • I wrote an article on SCA (Service Component Architecture) and SDO (Service Data Objects). To dig into the details I used the Apache Tuscany M1 implementation for Java. My opinion: SCA and SDO are really cool technologies. They might not be perfect, but they head in the right direction. Specifically, they address composition and modularization aspects of SOA systems as well as ESB issues.
  • In addition, I had to dig deeper into .NET 3.0: WCF, (W)WF, WPF, WCS. All these TLAs address really exciting technologies. I am absolutely interested in how to combine WCF (Windows Communication Foundation) with SCA.
  • Another cool framework I used was Ruby on Rails. I've been a Ruby expert for a long time, but had only very little knowledge about Ruby on Rails. After using it, I am really impressed. This framework proves that a sound language design has a direct impact on what you can build on top of it. The same thing in Java or C# might be possible, but it wouldn't feel natural to developers. It is amazing how fast Ruby on Rails applications can be built.
  • I had to prepare and organize the inevitable Web 2.0 track for the next OOP Conference. Web 2.0 is exciting as it combines technologies with advanced user experience. I got Markus Völter for the track and excellent speakers from Google and Microsoft. My talk will introduce the Web 2.0 space.
  • For the upcoming JAOO conference I promised a talk on Spring.NET. It is not as powerful and huge as Spring. Nonetheless, it offers a powerful dependency injection container supporting AOP as well as ASP.NET, .NET Remoting, Serviced Components, and more. Hopefully, .NET developers will soon recognize the power of IoC containers, which really offer a productivity boost and reduce dependencies on concrete technologies. A wizard for Visual Studio .NET would be great.
As you can easily see, I like playing and experimenting with all these new toys. It helps me in my job but is also great fun. Maybe, this was the only upside of the rainy weather in Munich.

Tuesday, August 08, 2006

Blind Spots

Did you ever encounter the following problem? Suppose you have written a text document. There are some very obvious typos in it. But no matter how often you read and check your text, you just can't spot those errors. They somehow resemble blind spots. The same holds for writing source code, although the compiler will at least point you to all syntax errors. These blind spots are even harder to find if they appear on the semantic or logical level. E.g., you are absolutely sure that you should do a depth-first search, while in fact a breadth-first search is more appropriate, such as in a chess program. What is the reason for these blind spots? I don't know whether there is a biological or psychological explanation. However, I assume that the brain, due to its filtering capabilities, lets you simply ignore the problem because it doesn't focus on the details but on the whole picture.
What does this imply for software engineering? From my viewpoint, this is a great example of why pair programming and the four-eyes principle are important - and also why code and architecture reviews are so essential. Whenever I write a document or create a design, I always ask someone else to cross-check it. Blind spots are a principal problem of humans. Thus, it is not a good idea to pretend they do not exist.

Monday, August 07, 2006

Hammer and Nail

It is probably human that people always try the same solutions and tools they have previously used, even in situations where this is not suitable. This often leads to the Hammer and Nail syndrome: Do you want to put a painting on the wall? Take a hammer and a nail. Do you want to connect two pieces with each other? Take the hammer and the nail. While this approach makes you look incredibly stupid in real life, it is often applied in software engineering. That's the reason why some projects are doomed to fail. Still not convinced? Make the following experiment: Go to a team of architects and developers who have used CORBA and C++ in all their previous projects. Tell them about a new project where a distributed system needs to be built. Guess what technologies they will recommend? I was often involved in projects where I heard that technology X would be a must. The responsible members didn't even know what problem they were going to solve, but were sure that a specific technology should be part of the solution. That's exactly what I mean by the Hammer and Nail syndrome in software engineering. Another example are those distributed systems where people used synchronous RPC-style communication even for event-based asynchronous applications such as network management or control systems. Here, synchronous RPCs are the last thing you should use. Note that these issues are not always obvious. For instance, James Gosling never stopped telling me that 90% of all enterprise Java projects used EJB even if there was no need for a component container. Having said all this, I must mention that the other extreme is the best-of-breed syndrome: people divide their problem space into a large number of sub-problems and choose for each sub-problem the best technology or tool available. This approach may lead to a nightmare, as no one will ever be able to handle dozens of different tools and technologies, especially when it is difficult to combine them.
Sometimes it is better to resort to sub-optimal solutions instead, thus minimizing the number of technologies and tools. What I often do as a consultant when asked what technology to use: I ask the stakeholders for a list of detailed and prioritized requirements. Then I ask independent technology experts how well they believe their favourite technology or tool is able to meet these requirements, and also ask them to do the same for the other technology options (as a cross-check). Of course, a detailed study of reports and articles is another possibility. Mostly, this decision matrix shows very clearly which technology fits the problem context best. If multiple options fit, I ask management to make the decision (never let them make you decide as a consultant, because that would be like shooting yourself in the foot). If there are related problems and decisions to be made, build groups of solutions. Of course, the Hammer and Nail syndrome is not only about tools and technologies but also about software architecture. People tend to apply the same architectural principles again and again, even if they are not suitable. Look at all these architectures where you'll find an observer or a strategy almost everywhere. If architects or engineers discover a new toy, they want to play with it all the time. This problem can only be addressed by code and design reviews (or by pair programming). If you face such a problem, tell the engineers exactly why using that kind of pattern or architectural solution isn't smart in that particular context. For instance, the observer pattern makes no sense if there is a bi-directional 1:1 dependency between two components. The problem is that every one of us (me too) may fall victim to the Hammer and Nail syndrome from time to time. I often found that in some cases I had to come up with a quick solution, so I used what I already knew. That is human, but it limits creativity.
Creativity also means finding new, innovative solutions instead of trying to apply the same old solutions again and again.

Sunday, August 06, 2006

Death by Flexibility

I remember a project where engineers were using CORBA as their favourite middleware solution. Those of you familiar with CORBA know that, as in most other middleware solutions, CORBA provides generic data types such as (Dyn)Any (which may dynamically represent any other data type) and unions (a set of alternative data types). Engineers in that particular project found these generic data types very useful. When they faced the problem of implementing a tree structure that should then be transmitted over the network, they came up with a solution relying on anys and unions. Once finished, performance was incredibly bad. It turned out that the middleware must interpret and (de)marshal tree structures consisting of unions and anys at run-time. While being very flexible, the solution was useless due to the performance penalties. In another project we saw engineers implementing a mediator component as the central part of their architecture. Every other subsystem depended on that mediator in the middle. Thus, the mediator became the centre of this application's universe. After a (short) while, design erosion started to badly impact the architecture. What the designers intended to create was a flexible solution; what they got was a maintenance nightmare. Other projects get addicted to the strategy syndrome: every piece of functionality is hidden behind a strategy implementation. In a large system this flexible solution leads to a configuration problem because, after all, someone needs to configure all those strategies at start-up time. Using strategies everywhere basically means: "I don't know now what kind of implementation should be used here. Thus, I leave that decision open for later generations of developers." But how should they know? Using a centralized or decentralized approach for finding peers is also a flexibility issue. A dynamic lookup approach such as the one available in Jini or peer-to-peer solutions offers high flexibility, but also more network load at the same time.
So, what can we learn from such experiences? Over-flexibility is doomed to fail. For instance, in the first example with all the any and union types, it is much smarter to either use CORBA value types or a concrete constructed type that the middleware does not need to parse at run-time. In the "overuse of strategies" example, the solution would be to open only those places of the architecture for extension or change where it is really required (Open/Closed principle). Overuse of strategies is often a sign of missing requirements, experience, or knowledge. In the god-like mediator example, it seems as if the developers had forgotten to perform a use case / scenario analysis. By considering the relevant main workflows, the architectural dependencies often show up very soon. For places where workflows need to be flexible, use solutions such as rules engines, observers, and/or dependency injection.
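
The Open/Closed advice can be sketched in Java (invented names, just an illustration): only one justified variation point, the discount rule, is opened for extension, while the rest of the logic stays concrete.

```java
// The pricing rule is a known, justified variation point...
interface DiscountPolicy {
    long apply(long priceInCents);
}

// ...while everything else stays concrete and simple.
final class Checkout {
    private final DiscountPolicy discount;

    Checkout(DiscountPolicy discount) { this.discount = discount; }

    long total(long[] itemPricesInCents) {
        long sum = 0;
        for (long p : itemPricesInCents) sum += p; // fixed, concrete logic
        return discount.apply(sum);                // the one open spot
    }
}
```

Compare this with a design where summation, rounding, and output formatting are each hidden behind a strategy as well: the extra flexibility buys nothing but configuration effort.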
What we also see is that performance and flexibility are difficult to achieve at the same time. Very flexible solutions often lead to performance penalties. Very performant solutions often get their performance boost from directly accessing critical hardware and software layers, which doesn't leave much space for flexibility.
However, don't take this as a general rule. For example, the BEA JRockit Java VM increases performance through flexibility. At start-up time, the VM detects what kind of environment it is running on and then automatically adapts its configuration accordingly, such as the choice of multithreading strategy or cache sizes. In other words, flexible patterns for resource management are very valuable, especially when adapted to the runtime environment at start-up or maintenance time.
To sum up, my main recommendation shortened to one single sentence is: Be as concrete as possible and as flexible as really necessary.

Sunday, July 16, 2006

My Home is My PODCASTle

I am one of those freaks who read IT literature even in their spare time. For me it is very relaxing to hear about new and interesting topics. As I love sports such as running and biking, I bought an iPod Nano and later an additional iPod Video (yes, I am running and biking large distances :-), both of which I filled up with my favourite music. Shortly after buying the Nano I saw these things called podcasts in the iTMS (iTunes Music Store). I tried it and started to subscribe to different podcasts. And now, guess what? Yes, I am listening to software engineering related podcasts while running or biking. So here it is, my current Top 5 list of podcasts:



  1. Software Engineering Radio is produced by a bunch of German software engineers, among them Markus Völter. This is from professional engineers for professional engineers.


  2. Ruby on Rails Podcast is for all those who develop Web-based software in Ruby. The experts you can hear are THE top experts in the field.


  3. Security Now! with Leo Laporte and Steve Gibson. If you are interested in every detail about security issues, subscribe to this podcast. It teaches all the important stuff. Absolutely entertaining.


  4. This Week in Tech by Leo Laporte really rocks. It is NOT about software but about IT news in general. Very entertaining. Hear this if you are a real geek.


  5. Scientific American. Normally not about Software Engineering, to be honest. But I must admit that science is another favourite of mine.


What about you? Do you also listen to podcasts? Any programming or architecture related podcast I forgot to mention? I'd be delighted to receive some pointers to other interesting stuff.

Tuesday, June 27, 2006

Aspects r' us

There is a lot of noise about Aspect-Oriented Software Development at the moment. On the other hand, I often hear from people, even from well-known experts, that they are sceptical about this paradigm. The reason I hear is that the most-cited "killer application" for aspect-orientation is logging and tracing, and that even that is not easily mappable to aspects. Sometimes I wonder whether the criticism is directed against AOSD in general or rather against AspectJ. I have no doubt that AOSD addresses a valid point. Using OO, architects and developers must decide on a specific one-dimensional model of how they view their universe, which is mostly influenced by the application domain. Unfortunately, the real world is multi-dimensional. In addition to the domain, architects must address infrastructural aspects and non-functional issues. As a consequence, there are always different perspectives which together shape a multi-dimensional universe. Consequently, the best concept seems to be to initially address and plan all the different perspectives and how they need to be combined. The second step is to model the different perspectives. The third step consists of combining these views. Personally, I prefer this concept, which HyperJ introduced. The problem: it is often not easy to define the "combine" operator, as the views might even be interdependent. In tools such as AspectJ we basically use multiple two-dimensional views. The base view is always the domain, while the second dimension is determined by a set of aspects. This two-dimensional approach is much easier to use for real-life programming, while the multi-dimensional approach has clear advantages for architecture design.
A valid question in this context is: should one use AspectJ as a kind of DSL in addition to a programming language such as Java, or is it better to apply generative techniques such as MDSD to generate the solution from a DSL? The challenge: the runtime aspects of AspectJ have no counterpart in generative approaches. A possible approach would be to use integrated DSLs to provide different views in a multi-dimensional approach, for example one DSL for the domain, one for security, one for fault-tolerance. From these descriptions an MDSD generator would then generate Java and AspectJ code. This would make aspects an implementation issue.
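
To make the crosscutting idea tangible without AspectJ, here is a plain-Java analogy using a dynamic proxy (a sketch with invented names; this is not how AspectJ works internally): the tracing concern lives in one place instead of being scattered over every method.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface Catalog {
    String find(String item);
}

final class SimpleCatalog implements Catalog {
    public String find(String item) { return "found: " + item; }
}

final class Tracing {
    static final List<String> LOG = new ArrayList<>();

    // Wrap any interface with a tracing "aspect" defined in one place.
    @SuppressWarnings("unchecked")
    static <T> T traced(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("enter " + method.getName());
            Object result = method.invoke(target, args);
            LOG.add("exit " + method.getName());
            return result;
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[]{iface}, handler);
    }
}
```

The domain code (SimpleCatalog) knows nothing about tracing; the crosscutting concern is woven in at one central point.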
No matter how we view it, we always have to cope with multiple dimensions. Unfortunately, as already stated, the number of dimensions is proportional to the complexity of the problem space. Inherent complexity cannot be removed (in contrast to accidental complexity). The implication therefore is: the complexity arises either at the OO level, or in the architecture, or in the DSLs, their combination, and their use. Thus, we have to deal with this complexity. Until now, no one has come up with a really complete and consistent solution. Actually, that is exactly the reason why it is so difficult to meet operational and developmental qualities in a software architecture: it is a question of complexity.
From my current viewpoint, a combination of domain modeling, AOSD, and MDSD seems promising. But maybe, sometime in the future, someone will come up with the grand unified theory of software engineering.
My conclusion: even if you don't program in an aspect-oriented way and even if you don't use aspects, you will have to deal with them anyway. Thus, tools such as AspectJ are important as a thought-provoking mindset.
I am curious about your opinion!

Saturday, June 24, 2006

Variabilities

One of the issues software architects constantly encounter is variability. Variabilities are points in your architecture that may vary from implementation to implementation. Needless to say, one of the critical decisions within program families (i.e., when dealing with Product Line Engineering) is to determine all variabilities and commonalities. To illustrate the challenge, let me introduce an example. A container-hosted component needs to communicate with a particular remote object. The target address of this remote object constitutes a variability across different instantiations of the application. When and how can this variability be resolved? Even for this simple example there are various choices:
  • Development time: The target address could be hard coded into the client code.
  • Compile/Link time: The target address is separated into a different file which is compiled and linked to the component (e.g. a proxy generated using WSDL).
  • Deployment time: The target address is specified by a configuration file which the container parses. It then passes the target address to the component (or a proxy) using dependency injection upon instantiation of the component.
  • Maintenance time/Runtime: The proxy which the component uses to access the remote object is implemented as a DLL or shared library. This DLL might be exchanged by the container, either using hot deployment at runtime or when the operation of the application is paused for maintenance.
Patterns such as Decorator, Proxy, Interceptor or Strategy are helpful to deal with these variabilities. Programming languages also offer great support for variabilities when they provide concepts such as interfaces, polymorphism, and generic types. It is important to mention that each of these binding times has different implications. For example, runtime binding is very flexible but might lead to performance penalties. In other words, the looser the coupling between the implementation of a variability and the application, the more flexible the application might be, but also the more resource consumption might be involved.
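The deployment-time case can be made concrete with a minimal Java sketch (all names such as AddressResolver and the `remote.target` property are hypothetical). The component depends only on a resolver interface, so the binding of the target address can move between development time and deployment time without touching client code:

```java
import java.util.Properties;

// The variability: where does the remote object live?
interface AddressResolver {
    String targetAddress();
}

// Development-time binding: the address is hard coded.
class HardCodedResolver implements AddressResolver {
    public String targetAddress() { return "http://host-a/service"; }
}

// Deployment-time binding: the address comes from configuration and is
// injected by the container when the component is instantiated.
class ConfiguredResolver implements AddressResolver {
    private final String address;
    ConfiguredResolver(Properties config) {
        this.address = config.getProperty("remote.target");
    }
    public String targetAddress() { return address; }
}

// The component only sees the interface (Strategy pattern), so the
// binding time is invisible to it.
class RemoteClient {
    private final AddressResolver resolver;
    RemoteClient(AddressResolver resolver) { this.resolver = resolver; }
    String connect() { return "connecting to " + resolver.targetAddress(); }
}

public class BindingDemo {
    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("remote.target", "http://host-b/service");
        System.out.println(new RemoteClient(new ConfiguredResolver(config)).connect());
    }
}
```

Swapping HardCodedResolver for ConfiguredResolver changes the binding time but not the client, which is the point of resolving variabilities behind an interface.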
Another important point in this context is that variabilities might depend on other variabilities, which can be described using feature modelling. For example, if we use remote object RO1 then we also need to access RO2. Thus, the binding time and order need to be determined in advance. But that is a topic for a future posting.

Thursday, June 22, 2006

The Arrival of Language Integrated DSLs

As announced in my last posting about Lisp and LINQ, I'd like to discuss an increasingly important topic in a little more detail. I spoke about Language Integrated DSLs (LIDs), which basically combine programming languages with DSLs. What does this mean? Let me introduce an example for motivation. As you know, the basic problem of accessing a database from Java or C# is the impedance mismatch. Either we choose a programming-language perspective using an Object-Relational Mapping, thus being unable to leverage some strengths of the database system. Or we choose a relational-database perspective instead, where we get all the power of relational algebra but don't integrate with the Java object model. You can see the same problem again and again, for example when dealing with XML and Java or C#.

The idea of LIDs is to integrate a sublanguage into your programming language. For example, Microsoft LINQ (Language Integrated Query) allows you to use query expressions within C# programs to access the database. This is integrated into the languages (C#, VB). Don't confuse this with previous approaches such as SQLJ or JDBC, which relied on preprocessing or adaptation only. The LID LINQ is integrated into the programming language C#. Note that this has some proximity to other concepts such as multi-paradigm programming (read Cope's book for details - Cope stands for Jim Coplien :-). It is also an approach that is heavily used in XML, where schemas may be modularized and integrated. For example, SOAP and WSDL use XSD as the basis for type definitions and declarations. To be honest, meta annotations are also a kind of additional language on top of your programming language, which means that the integration of languages might happen in different ways. Future programming languages could become extensible sets of core language features that might be integrated with(in) sub-languages (LIDs). For example, a domain-specific language could be part of your favourite programming language.
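LINQ itself lives in C# and VB, but the flavor of a query sublanguage embedded in the host language can be sketched with Java's Stream API (assuming a modern JDK; the Product type and the catalog are invented for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;

public class QueryDemo {
    // Hypothetical catalog entry, invented for this sketch.
    record Product(String name, double price) {}

    // A query written inside the host language - roughly what LINQ's
    // "from p in catalog where p.Price < limit select p.Name" expresses in C#.
    static List<String> cheapProducts(List<Product> catalog, double limit) {
        return catalog.stream()
                .filter(p -> p.price() < limit)
                .map(Product::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Product> catalog = List.of(
                new Product("book", 12.0),
                new Product("dvd", 25.0),
                new Product("pen", 2.0));
        System.out.println(cheapProducts(catalog, 15.0)); // [book, pen]
    }
}
```

The query is type-checked by the compiler and composes with the rest of the object model - exactly the integration property that preprocessor-based approaches lack.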
Note that this approach does not remove the need for higher-level domain-specific languages. These are still very important, for example in Model-Driven Software Development. From my viewpoint, AOP approaches are also candidates where LIDs might be helpful. An aspect or a related set of aspects can be considered a language. Tools such as AspectJ help to formalize the language and integrate it with Java. The former HyperJ was also heading in this direction. Another advantage of these kinds of modularized languages is that you don't need to bloat languages or libraries any more to get all those important features into the programmer's toolset. Instead, configure your core language with all the sub-languages you require for your concrete problem. Language integration might become a powerful tool for the future. I am really interested in what others think.

Saturday, June 17, 2006

Lisp again!

I just saw a nice posting on Gernot Starke's blog: http://it-and-more.blogspot.com/2006/05/little-more-on-lisp.html.
There he addresses how developers can learn Lisp. I like this posting because in recent research projects I've made the personal observation that people tend to constantly reinvent the wheel. A good example are developers who grew up with Java, C++, or C#. They are often astonished by new and cool language capabilities such as closures (Ruby) or lambda expressions (LINQ). Why am I talking about this issue? The good ole languages such as Lisp and Prolog pioneered several of the features that are so exciting today. Model-checking in model-based software development is a good example where Prolog turns out to be a clear winner. And Lisp simply is the singularity where the big bang of dynamic languages originated. Knowing these languages is absolutely valuable. Know the idioms of these languages and you will benefit in your daily developer life. Erik Meijer recently said at a conference: "you should learn a new language each year". That's the right strategy :-)
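For readers who grew up with Java: a closure is simply a function that carries its enclosing state with it. Here is a minimal Java sketch of the classic Lisp counter idiom (an AtomicInteger stands in for mutable captured state, since Java lambdas can only capture effectively final variables):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ClosureDemo {
    // Returns a function that "closes over" its own counter - each call to
    // makeCounter() produces an independent piece of captured state.
    static Supplier<Integer> makeCounter() {
        AtomicInteger count = new AtomicInteger(0);
        return count::incrementAndGet;
    }

    public static void main(String[] args) {
        Supplier<Integer> next = makeCounter();
        System.out.println(next.get()); // 1
        System.out.println(next.get()); // 2
    }
}
```

Lisp programmers have written this in three lines of `let` over `lambda` for decades, which is exactly the point of the posting.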

Complexity and Software Architecture

Often when speaking to other people about software architecture, the term "complexity" is mentioned at some point. And I have to admit, I am also using "complex" and "complexity" very often. Have you ever thought about what complexity really means? If you perform a Google search for "define:complexity" you'll get some hits, most of them relating to wine or coffee blends and taste. Or is it "generally avoided as an overused and poorly defined word, except in specific systems", as suggested in http://ishi.lanl.gov/diversity/Glossary1_div.html.

At university we learned that the complexity of algorithms is measured by the amount of "processing time" it takes to solve a given problem, depending on the size of the input, e.g.:

  • O(c) means that solving a specific problem always takes the same time, independent of the input. Example: a constant function that always returns 42.
  • O(N) means that the processing time increases linearly with the input size.
  • O(log N): time for searching an element in a sorted array.
  • O(N log N): time to sort an unsorted array of N values.

We know that if we can map a given problem in constant time to, let's say, a sorting problem, and vice versa, then the problem will also have O(N log N) complexity. However, this kind of complexity is not that significant for software architecture design, is it? At least, we start to suspect that there might be different kinds of complexity.
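As a small illustration of the O(log N) case above, here is the classic binary search over a sorted array; each iteration halves the remaining range, so N elements need at most roughly log2(N) steps:

```java
public class SearchDemo {
    // O(log N) search in a sorted array.
    static int binarySearch(int[] sorted, int key) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;  // unsigned shift avoids int overflow
            if (sorted[mid] < key) lo = mid + 1;       // discard lower half
            else if (sorted[mid] > key) hi = mid - 1;  // discard upper half
            else return mid;                           // found
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 7, 9};
        System.out.println(binarySearch(data, 7));  // 3
        System.out.println(binarySearch(data, 4));  // -1
    }
}
```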


In old Latin, complexity denoted the total set of possibilities and capabilities. Thus, we can draw the conclusion that a software architecture is complex if it reveals a large set of properties and capabilities.

An appropriate way to get a gut feeling for software architecture complexity is to ask the following question: What are typical implications when a software architecture IS complex?

Static Structure:

  • The architecture consists of a whole ocean of entities with lots of different relationships
  • It typically comprises insufficient or confusing abstractions
  • There is no clear separation of concerns. For instance, some entities are overloaded with various unrelated responsibilities, a point that is often tightly related to the previous issue

Dynamics:

  • There are a lot of possible workflows
  • The system contains many states and transitions

As you surely know, the old saying is that software architecture and organization are just two sides of the same coin. Complexity of architecture may thus be caused by your organization:

  • No clear team responsibilities and role assignments
  • Insufficient or missing process
  • High level of political issues in daily work
  • Documentation-addiction
  • Missing documentation
  • Insufficient amount of time dedicated to architecture design
  • No supervision of architecture realization
  • Lack of adequate testing
  • Insufficient team education upfront

In summary, complexity in software architecture is mainly caused by missing or overused abstractions and inadequate separation of concerns in either static structure or dynamics. Or in other words, the RUP 4+1 view helps us to structure complexity in different (4+1) areas. In addition, inadequate processes, tools, education, and organizational issues inevitably cause software architecture complexity.

Good means to prevent complexity are manifold:

  • software patterns
  • usage of frameworks and containers
  • aspect-oriented programming if done right
  • model-based software development if done right
  • higher abstraction by introducing domain-specific languages and domain modelling
  • metrics if applied right
  • usage of appropriate methods and tools
  • requirements traceability
  • ...

All of these means help to obtain appropriate abstractions and a sound mapping of responsibilities to entities. Note that humans are normally only capable of processing about seven (plus or minus two) entities at the same time. Hence, this limit should be taken into account on each abstraction layer and for each architectural perspective.

But that's only my 2c.

I am wondering what your opinions are w.r.t. complexity? Any complexity definition that makes sense?

Saturday, June 10, 2006

Michael's Pattern Laws

Here are some laws I have found over the last years as a software architect and that I'd like to share with you. Maybe you could share your own insights.

  1. Patterns are no surrogate for human intuition and creativity.
  2. Overload of patterns in your software architecture implies overload of problems in your system. However, not using patterns where applicable may make your life extremely unpleasant.
  3. If you just found that cool new pattern, think again before bothering the rest of us! (Remark: I also want to remind you of Brian Foote's famous words: "a pattern is an aggressive disregard of originality").
  4. Patterns are your best friends if they are treated with friendliness in your architecture design.
  5. It is easy to become a pattern author but it is surprisingly hard to write a good pattern description.
  6. If grandma understands it, it is probably a good pattern description.
  7. Patterns that can be easily formalized are no patterns.
  8. Patterns and Agility? Patterns are about agility. Without patterns your software architecture tends to become overly complex and thus hard to maintain, change, or evolve.
  9. The number of pattern books has significantly increased in the last years but that is not necessarily a sign of good quality.
  10. Patterns are dead, CORBA is dead. All technologies that turn from hype to pragmatic technologies are declared dead once upon a time. If a technology is considered that way, it is typically safe and valuable to use.
  11. A pattern is no island. It reveals its true power when connected with other patterns to form complete landscapes.
  12. You can classify patterns in infinite ways. For example in structural, behavioral or creational patterns such as GoF. Or you may partition the pattern space with respect to granularity, process phases or domain facets. I prefer to have only two classes of patterns, good ones and bad ones.
  13. Time is money. Applying patterns saves time. Thus, patterns are money! Don't forget to tell your managers that.
  14. Always add a real life example to your pattern descriptions because some non-software people in your projects won't see the value of patterns otherwise. Take management as an example.
  15. If you are a good guy, apply patterns. If not, anti-patterns might be a more appropriate choice.
  16. Patterns are not just collections of UML diagrams. I totally agree with Bertrand Meyer who once said "bubbles don't crash" and "all you need is code". Developers should memorize those sentences.
  17. Applying the right patterns the right way is like paradise. Applying the wrong patterns or applying the right patterns wrongly, however, is like hell. Thus, make sure you know what you are doing here.
  18. Sure, you got all pattern books. But that doesn't make you a pattern expert automatically.
  19. As the pointy-haired boss always emphasizes in Dilbert: "work smarter, not harder". Applying patterns is generally considered smart.
  20. Beware of Murphy's Laws when applying patterns.

Friday, May 26, 2006

Agility and the Borgs

I have been a trekkie for a long time. Maybe because software engineers are more open-minded about this kind of SciFi movies and series. At the EXPO-C conference in Karlskrona, Sweden, Jimmy Nilsson and I participated in some Open Space events after the conference day. If you wonder what an Open Space is about: the goal of an Open Space is to meet people and discuss interesting topics. For this purpose, everyone can add her/his topic to a poster hanging on a wall. Then, for each of these topics, people are asked to participate in meetings that take place for a fixed amount of time. It is allowed to leave a group and move to another one (if you weren't the one who came up with the topic). This works very well to get input and feedback from smart people. In Karlskrona we discussed topics such as "when is simple too simple", "agility", and "what does it take to educate architects", just to give you an impression. After one of those "agile" topics I was wondering how these agile processes apply to Star Trek. A strange combination of topics, you might think now. One of the points discussed at the conference was whether hierarchical structures are suitable for agile processes. Thus, I asked myself what kind of process paradigm the Borgs are using. As you know, the Borgs are known to be very rude, assimilating other creatures to integrate their knowledge and making them robots without any will of their own. The Borgs are controlled by a central instance - the queen - but are tightly integrated into a community of heavily communicating and interacting creatures. It seems as if the queen is in charge of providing central goals, but Borgs are able to achieve these goals by any means they consider appropriate. For now, we recognize that Borgs use communication and interaction, share their knowledge with each other, can operate independently as groups but not as individuals, continuously share their knowledge and abilities, and strive for common code ownership.
It is not obvious from the series whether the Borgs follow a BDUF approach or are able to refactor and refine their work in the typical case of evolving and changing requirements. However, their behavior suggests that they are capable of adapting quickly to changing contexts. Take their ability to modify the frequency of their protection shields as a prominent example. So, does the society of Borgs constitute an agile organization, or do they follow a waterfall model or a kind of hybrid approach? From my gut feeling, I always thought that agility requires smart and autonomous developers who organize themselves in a P2P fashion within a project. Do hierarchies, central organization, and agility denote concepts that can be combined, or do they contradict each other? Is there a dependency between organization and process paradigm? Are there any contexts in which agile processes are inappropriate? If agility means the ability to constantly adapt to changing environments and requirements, then the Borgs obviously denote an agile organization. Even in an agile process someone needs to set the goals, coordinate all activities, and control the process. Why are the Borgs so successful, anyway?

Saturday, May 06, 2006

The Problem with Metrics

From time to time I am asked what I think about evaluating software systems using metrics. The issue with most metrics is their close association with implementation artifacts.

One of the most wide-spread metrics is LOC. Measuring a software system by lines of code does not make much sense in many cases. What does it mean that a subsystem contains a specific number of lines of code? What about lines generated by model-driven software development tools, middleware generators, GUI builders, or IDEs? That does not mean, however, that LOCs are completely worthless. At least they can give you hints where quality problems might lurk. For example, if a method contains several hundred lines of code, then you definitely got a problem with inappropriate modularization.

Another example is Cyclomatic Complexity (CC) introduced by McCabe. The CC of your system's runtime graph can be calculated as CC = E - N + 2P (see the Wikipedia entry on cyclomatic complexity). Here, E denotes the number of edges in the graph, N is the number of nodes, and P the number of connected components. According to McCabe, a value of CC greater than 50 means your (part of the) system has too much complexity and reveals high risk. When applied to an Observer pattern with 50 observers, the CC will be larger than 50! Unfortunately, we all know that the Observer pattern is everything but complex and risky, even when used with high numbers of observers. The problem here is that CC simply counts all connections, even if they are all of the same type.

What does all that mean? My point here is that metrics are of limited value for a software architect. For each metric used, a software architect should be aware of its strengths and limitations. Architecture qualities and their lack can be evaluated better using other means. I will write about these qualities in a future posting. Of course, in the meantime I am very interested in what you think about metrics.
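The arithmetic behind McCabe's metric, in its common textbook form M = E - N + 2P, is trivial once you have counted the edges, nodes, and connected components of a control-flow graph. A tiny sketch; the numbers in the example describe the graph of a single if/else:

```java
public class CyclomaticDemo {
    // McCabe: M = E - N + 2P
    // (E edges, N nodes, P connected components of the control-flow graph)
    static int cyclomaticComplexity(int edges, int nodes, int components) {
        return edges - nodes + 2 * components;
    }

    public static void main(String[] args) {
        // Control-flow graph of one if/else: 4 nodes (entry, then-branch,
        // else-branch, exit), 4 edges, 1 component -> M = 2.
        System.out.println(cyclomaticComplexity(4, 4, 1)); // 2
    }
}
```

Equivalently, M is one more than the number of decision points, which is why one extra observer registration adds nothing to a method's CC while one extra `if` does.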

Wednesday, May 03, 2006

What is Software Architecture

Anyone involved in software development projects is used to the term "Software Architecture". If asked what software architecture really means, most people will throw some of the well-known definitions at you. Just give it a try and search for it in Google, Wikipedia, or whatever information source you prefer. You will definitely be overwhelmed by the number of hits. In http://www.sei.cmu.edu/architecture/definitions.html some definitions are available. It's definitely entertaining to read all these definitions, which are all valid but each show only a part of the whole picture. To be honest, all of these definitions are mostly useless. It is much more important to ask what software architecture is going to provide for a project. In other words: if software architecture is the answer, what exactly is the question? Basically, every software system has a software architecture. However, a software architecture might be worthless when it is built by accident. Let me give you an example. If I just put together some spaghetti code using a trial-and-error approach without ever thinking about software architecture, I'll get an implementation that probably will work. Unfortunately, I'll have no clue why the system is working, and I will definitely have a very bad time when asked to maintain, evolve, change, integrate or otherwise modify or understand the system. As a consequence, software architecture in a narrower sense should be the result of systematic analysis and design. But even systematic analysis and design might lead to problems when considering implementation aspects only. I've seen a lot of systems that were realized with a specific middleware, operating system, or database system in mind. As soon as some of these system details changed, the development teams were doomed to fail. Thus, software architecture should focus on implementation-independent aspects, but also stay in harmony with the underlying system infrastructure (a.k.a. system architecture).
In a further step, we should ask ourselves what drives the software architecture design. Basically, all kinds of forces, i.e. requirements such as desired properties, constraints, commonalities, and variabilities, have an impact on the architecture. But you may ask: what if there are conflicting requirements, and how to deal with requirements when there are so many of them? All requirements should be prioritized. If there are conflicting requirements with the same priority, then you got a problem! In this case the conflict must be resolved with all stakeholders. That brings me to a further perspective on software architecture: first of all, software architecture is about communication of explicit and documented design decisions. This is the reason why I consider domain modeling so important. In addition, software architecture is not only about structures, relationships, and dynamics, but also about the process used to define all these structures and relationships as well as their interactions. In this context, it proves particularly important to differentiate between the strategic baseline architecture and tactical design. The core of the architecture should implement all important strategic requirements. Tactical refinements must not modify strategic decisions. Thus, designing software architecture is mainly a top-down approach, but may also use bottom-up integration of legacy systems or components. As already mentioned, requirements are the drivers of software architecture design. In the process, requirements need to be realized in the software architecture, according to their priority, in such a way that requirements traceability is easily possible. Without traceability, design erosion will destabilize the whole system, requiring additional efforts for refactoring or architecture recovery. Thus, software architecture mainly deals with how to map requirements to appropriate structures. Here, granularity aspects are important.
Have you ever tried to build a house by assembling atoms? That does not make sense, as complexity cannot be mastered using such a fine-grained approach. Therefore, software architecture is about mastering complexity by introducing appropriate abstractions. Patterns are a good example of such "abstract" architecture entities. As some requirements denote cross-cutting concerns, the right structuring of a software system under development is not easy. Consequently, software architects must introduce different perspectives. These different perspectives must be combined and converged into a unified and consistent software architecture. Therefore, software architecture is also about perspectives and the combination of these perspectives, which, however, turns out to be one of the most complex tasks in any software development project because of emergent behavior. Of course, this posting could only show the tip of the iceberg. I think it is obvious why no single definition of "software architecture" can ever satisfy everyone. Nor is it really important to have such a definition. It is much more important to have a common understanding of the benefits of systematic software architecture design.

Thursday, April 27, 2006

Events

Every year a lot of interesting conferences and events on software engineering take place world-wide, especially on Object-Oriented Programming and related topics. Unfortunately, there are too many events; it is impossible to attend even the most exciting ones. The big ones like JAX, OOP, OOPSLA or JAOO are cool because they offer conference programmes where everyone should be able to find her or his favourite topics covered by some superb experts. And you can meet so many people there. Definitely a good opportunity to strengthen your social network. On the other hand, big conferences tend to be more formalized, organized, and commercialized. In the last few years, I have increasingly enjoyed attending those small events which offer excellent topics and speakers, somewhere in small places. That's one of the reasons why I am really sorry that the TOOLS conferences were cancelled a few years ago. These events organized by Bertrand Meyer were absolute highlights. Fortunately, there are other small events that I would recommend. One of them is coming really soon, and I will be personally involved as a speaker: the EXPO-C conference in Karlskrona, Sweden, to be more concrete. I will give a full-day tutorial on Software Architecture and also speak on SOA. But that's not the reason why I am blogging about this event. They really got exciting speakers, talking about things such as Ruby, C#, Haskell, and Software Architecture. This is more an event on languages in a broad sense, as the organizers explain. And of course, it is an event in Sweden. I have never been there, but I guess you'll meet the same friendly and smart people there as in Denmark, Norway, or Finland. If this got you curious, here is your chance. You should definitely take a look at:

http://www.expo-c.se

Wednesday, April 26, 2006

Send all Market Analysts to a lonely Island!

This time, I'd like to discuss something absolutely different. A few weeks ago one of those famous market research companies - I won't tell you the name - announced that they had investigated interesting facts on the relevance of Podcasts. The result was amazing: only some weird techies are listening to Podcasts, while the vast majority of people is not interested in this Podcast thing. As a consequence, there is no business opportunity in this area, according to the infinite wisdom of market analysts. A few days later, a more serious analysis found out that the irrelevance of Podcasts was absolutely untrue. In the US alone, millions of people frequently enjoy Podcasting. I would have guessed that, as Podcasts are a kind of offline radio and cover as many topics as one could probably think of. This reminds me of an Ovum report several years ago where they compared all these cool remoting middleware technologies (RMI, DCOM, CORBA). They forecasted that DCOM would be the dominant technology in the future, which is ... now. BTW, can you remember this DCOM technology? DCOM probably stands for Dead COM. Can you remember the prediction that .NET and Java would each hold 50% of the market? And all these predictions that SOA will make everything else obsolete? All of this could be quite entertaining. Unfortunately, in software development projects I constantly get pointers from managers that refer to these market analysis reports. Even software developers and architects believe in this crap. I understand that people are searching for help when faced with uncertainty. In market and technology reports you can find all these technology evaluations, product comparisons, and market forecasts. And, even worse, for almost all predictions in these reports you are able to discover additional reports that tell exactly the opposite, at least if you are digging long enough. Reminds me of Winston Churchill, who once mentioned that he only believes his own wrong statistics.
Sometimes I ask myself if anyone in this universe has ever tried to investigate the value of market reports. How many predictions turned out to be true, or at least almost true? The problem of all future forecasts is that you can only base your statements on a small amount of facts and a large amount of opinions. Opinions are just that. They contain personal preferences, try to project past developments into the future, and use linear models. The reality, unfortunately, is often disruptive and non-linear. New technology developments appear, some of these developments become hypes, while others disappear and may re-appear some years later. What can be done in the face of uncertainty? Base all your architecture and technology decisions on two things: the facts you know and the risks you must address. If necessary, hide technologies which might be subject to change using adapters or other means. Everything is better than gambling. And, even if it sounds overcritical, using market and technology reports is like gambling. Why do you think these guys have become market analysts and not software engineers?

Thursday, April 06, 2006

SOA is NOT about Web Services

Recently, I have been involved in a couple of discussions where people complained about the SOA Hype and all those enthusiastic XML Web services celebrations. I wrote an article on SOA for the IEEE Software Issue on Future Trends of Software Architecture which you might read on my web site. Here are my two cents.
  • XML Web services define a kind of meta middleware dedicated to integration problems. If you need to communicate between a Java EE application and a .NET application, then XML Web services represent one possible option. There are other alternatives in this scenario, such as CORBA or RMI adapters. However, in most situations where you have to cope with heterogeneous systems without control over which applications will have to be integrated, XML Web services are the clear choice.
  • Service-Oriented Architecture as the name implies is not dependent on a specific implementation technology such as XML Web services. Likewise, Object-Oriented Programming is not dependent on Java. Instead, I consider SOA as a set of architectural principles that emphasize loose coupling. I will explain this point in the remainder of this posting.

A typical example of a SOA-compliant technology is regular e-mail. I'd like to present some of the properties of SOA systems using this simple showcase.

SOA requires explicit, role-based interfaces for services. Clients only see interfaces, never implementations. As the Bridge pattern is applied, client and server are implementation-agnostic with respect to each other. Communication between client and service is provided by standardized protocols.

Applied to e-mail: mail clients communicate with mail servers using POP3, IMAP or SMTP. As there are no implementation dependencies, a mail client does not need to care about the implementation of the mail server, and vice versa. Thus, you are completely free to use any mail client and mail server as long as your communication peers abide by the standard protocols and standard interfaces. E-mail is a little bit special in that the services and their semantics are predefined.

SOA communication relies on asynchronous message exchange, although there might be transparency layers on top that provide remote-method-invocation-style communication. Messages are in general dynamically routed, possibly passing through indirection layers.

Applied to e-mail: mails are passed as messages from their origin (the client's outbox) to their destination (the recipient's inbox) via different intermediate layers. That means, for example, that you are not blocked even if your communication peer hasn't yet received your mail.
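The non-blocking character of such message exchange can be sketched in a few lines of Java; here a BlockingQueue stands in for the whole chain of mail servers between outbox and inbox:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MessagingDemo {
    public static void main(String[] args) throws InterruptedException {
        // The queue plays the role of the intermediate layer: the sender
        // deposits messages and continues, regardless of the receiver.
        BlockingQueue<String> outbox = new LinkedBlockingQueue<>();

        outbox.put("mail-1");  // sender is not blocked ...
        outbox.put("mail-2");  // ... even though nothing was consumed yet

        // The receiver picks the messages up later, in its own time.
        List<String> inbox = new ArrayList<>();
        outbox.drainTo(inbox);
        System.out.println(inbox); // [mail-1, mail-2]
    }
}
```

Sender and receiver never hold references to each other; they are coupled only through the message channel, which is the loose coupling the posting describes.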

SOA messages denote standardized documents with different sections such as body or headers. Multiple messages may be used to transmit data in chunks, if necessary. The header usually contains data related to cross-cutting concerns such as routing, attachments, or security. The body content may be predefined between the communicating peers or be totally application-specific.

Applied to e-mail: mails are sent using a mail header and a mail body. All content must comply with MIME types. That is the only constraint, and not much of a constraint in fact.

SOA message exchange patterns relate different messages with each other to introduce additional communication styles such as request/response or one-way. Another example is the return of fault messages in error situations.

Applied to e-mail: requiring a receiver to send an acknowledgement is the typical example. And what about failure scenarios? When a recipient is not recognized, an error message is returned to the sender.

In SOA systems, business processes are first-class entities. Business processes introduce domain-specific languages to combine distributed services into whole workflows. Note that business processes are not just sequential invocations of services. Instead, properties might be applied to complete workflows, such as transaction contexts or other kinds of coordination.

Applied to e-mail: this is not directly supported. Instead, such workflows are implemented by applications or human interaction.

This concludes my discussion on e-mail as a SOA implementation example. Another example for a SOA based implementation technology is Messaging middleware such as MSMQ or MQSeries.

Loose coupling is the central mantra of all SOA architecture principles. Don't let vendors or press media fool you. SOA is NOT about XML Web services. Instead, it denotes an architectural paradigm for distributed computing that can be implemented using different technology options. Does it solve all problems? Ideally, it is applicable to all problem domains with inherent loose coupling of distributed entities. But it is counterproductive to apply the SOA paradigm in problem contexts where tight coupling is mandatory such as in several embedded or realtime systems. One could argue that in the end every distribution middleware relies in its bottom layers on SOA based communication such as TCP/IP. But that argument resembles the argumentation that we could implement all our software systems using machine code.

Bottom line: Always use the abstraction layer that helps building your system effectively and efficiently. Don't accept software development projects to be influenced by inadequate personal technology preferences or political decisions.