Hitchhiker's Guide to AI, Software Architecture, and Everything Else

If you are a software engineer: DON'T PANIC! This blog is my place to beam thoughts on the universe of Artificial Intelligence and Software Architecture right to your screen. On my infinite mission to boldly go where (almost) no one has gone before I will provide in-depth coverage of architectural and AI topics, personal opinions, humor, philosophical discussions, interesting news and technology evaluations. (c) Prof. Dr. Michael Stal

Tuesday, June 27, 2006

Aspects r' us

There is a lot of noise about Aspect-Oriented Software Development (AOSD) at the moment. On the other hand, I often hear from people, even from well-known experts, that they are sceptical about this paradigm. The reason I hear is that the most cited "killer application" for aspect-orientation is logging and tracing, and that even this does not map easily to aspects. Sometimes I wonder whether the criticism is aimed at AOSD in general or rather applies to AspectJ. I have no doubt that AOSD addresses a valid point. Using OO, architects and developers must settle on one specific, one-dimensional model of their universe, which is mostly influenced by the application domain. Unfortunately, the real world is multi-dimensional. In addition to the domain, architects must address infrastructural aspects and non-functional issues. As a consequence, there are always different perspectives which together shape a multi-dimensional universe.
Consequently, the best concept seems to be a three-step one. The first step is to identify and plan all the different perspectives and how they need to be combined. The second step is to model the different perspectives. The third step consists of combining these views. Personally, I prefer this concept, which HyperJ introduced. Problem: it is often not easy to define the "combine" operator, as the views might even be interdependent. In tools such as AspectJ we basically use multiple two-dimensional views. The base view is always the domain, while the second dimension is determined by a set of aspects. This two-dimensional approach is much easier to use for real-life programming, while the multi-dimensional approach has clear advantages for architecture design.
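To make the "domain view plus aspects" idea a bit more tangible, here is a minimal sketch. It is not AspectJ: it uses a Python decorator as a rough stand-in for tracing advice, and all names (traced, trace_log, transfer) are made up for illustration.

```python
import functools

# Hypothetical trace log; stands in for a real logging backend.
trace_log = []


def traced(func):
    """Tracing 'aspect': wraps a function without touching its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        trace_log.append(f"enter {func.__name__}")
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and on exceptions alike.
            trace_log.append(f"exit {func.__name__}")
    return wrapper


@traced
def transfer(balance, amount):
    """Domain view only: no tracing code appears here."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


new_balance = transfer(100, 30)
```

The domain function stays free of tracing code, and the second dimension is attached from the outside. Unlike an AspectJ pointcut, the decorator has to be applied to each function explicitly, so this only approximates the idea.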
A valid question in this context always is: should one use AspectJ as a kind of DSL in addition to a programming language such as Java, or is it better to apply generative techniques such as MDSD to generate the solution from a DSL? Challenge: the runtime aspects of AspectJ have no counterpart in generative approaches. A possible approach then would be to use integrated DSLs to provide different views in a multi-dimensional approach. For example, one DSL for the domain, one for security, one for fault-tolerance. From these descriptions an MDSD generator would then generate Java and AspectJ code. This would make aspects an implementation issue.
No matter how we view it, we always have to cope with multiple dimensions. Unfortunately, the number of dimensions grows with the complexity of the problem space. Inherent complexity cannot be removed (in contrast to accidental complexity). The implication therefore is: the complexity arises either at the OO level, or in the architecture, or in the DSLs, their combination, and their use. Thus, we have to deal with this complexity. So far, no one has come up with a really complete and consistent solution. Actually, that is exactly the reason why it is so difficult to meet operational and developmental qualities in a software architecture. It is a question of complexity.
From my current viewpoint, a combination of domain modeling, AOSD and MDSD seems promising. But maybe, sometime in the future, someone will come up with the grand unified theory of software engineering.
My conclusion: even if you don't program in an aspect-oriented style and even if you don't use aspects, you will have to deal with them anyway. Thus, tools such as AspectJ are important as a thought-provoking mindset.
I am curious about your opinion!

Saturday, June 24, 2006

Variabilities

One of the issues software architects constantly encounter is variability. Variabilities are points in your architecture that may vary from implementation to implementation. Needless to say, one of the critical decisions within program families (i.e., when dealing with Product Line Engineering) is to determine all variabilities and commonalities. To illustrate the challenge, let me introduce an example. A container-hosted component will need to communicate with a particular remote object. The target address of this remote object defines a variability across different instantiations of the application. When and how can this variability be resolved? Even for this simple example there are various choices:
Development time: The target address could be hard-coded into the client code.
Compile/Link time: The target address is separated into a different file which is compiled and linked to the component (e.g., a proxy generated using WSDL).
Deployment time: The target address is specified in a configuration file which the container parses. The container then passes the target address to the component (or a proxy) using dependency injection upon instantiation of the component.
Maintenance time/Runtime: The proxy which the component uses to access the remote object is implemented as a DLL or shared library. The container might exchange this DLL either at runtime using hot deployment or when the operation of the application is paused for maintenance.
Patterns such as Decorator, Proxy, Interceptor or Strategy are helpful for dealing with these variabilities. Programming languages also offer great support for variabilities when they provide concepts such as interfaces, polymorphism, and generic types. It is important to mention that each of these binding times has different implications. For example, runtime binding is very flexible but might lead to performance penalties. In other words, the looser the coupling between the implementation of a variability and the application, the more flexible the application might be, but also the more resources it might consume.
Another important point in this context is that variabilities might depend on other variabilities, which can be described using feature modelling. For example, if we use remote object RO1 then we also need to access RO2. Thus, the binding time and order need to be determined in advance. But that is a topic for a future posting.
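The difference between two of these binding times can be sketched in a few lines. This is only an illustration under assumptions: RemoteProxy, Container, the JSON configuration format and both addresses are invented for the example and do not refer to any particular container technology.

```python
import json


class RemoteProxy:
    """Hypothetical stand-in for a proxy to the remote object."""

    def __init__(self, target_address):
        self.target_address = target_address

    def describe(self):
        return f"calls go to {self.target_address}"


# Development-time binding: the address is fixed in the source code.
HARD_CODED_ADDRESS = "http://localhost:8080/ro1"  # made-up address


def create_hard_coded_proxy():
    return RemoteProxy(HARD_CODED_ADDRESS)


# Deployment-time binding: a container reads the address from its
# configuration and injects it when it instantiates the proxy.
class Container:
    def __init__(self, config_text):
        self.config = json.loads(config_text)

    def create_proxy(self):
        return RemoteProxy(self.config["target_address"])


# Made-up deployment configuration; in practice this would be a file.
deployment_config = '{"target_address": "http://example.org/ro1"}'
proxy = Container(deployment_config).create_proxy()
```

Changing the address in the first variant means changing and redeploying code, while in the second variant only the configuration changes. The price is one more level of indirection, which is exactly the flexibility versus resource trade-off mentioned above.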

Thursday, June 22, 2006

The Arrival of Language Integrated DSLs

As introduced in my last posting about Lisp and LINQ, I'd like to discuss an increasingly important topic in a little more detail. I spoke about Language Integrated DSLs (LIDs), which basically combine programming languages with DSLs. What does this mean? Let me introduce an example for motivation. As you know, the basic problem of accessing a database from Java or C# is the impedance mismatch. Either we choose a programming language perspective using an Object Relational Mapping and are thus unable to leverage some strengths of the database system. Or we choose a relational database perspective instead, where we get all the power of relational algebra but don't integrate with the Java object model. You can see the same problem again and again, for example when dealing with XML and Java or C#. The idea of LIDs is to integrate a sublanguage into your programming language. For example, Microsoft LINQ (Language Integrated Query) allows you to use select statements within C# programs to access the database. This is integrated into the languages (C#, VB). Don't confuse this with previous approaches such as SQLJ or JDBC that relied on preprocessing or adaptation. Note that this has some proximity to other concepts such as multi-paradigm programming (read Cope's book for details - Cope stands for Jim Coplien :-). It is also an approach that is heavily used in XML, where schemas may be modularized and integrated. For example, SOAP and WSDL use XSD as the basis for type definitions and declarations. To be honest, meta annotations are also a kind of additional language on top of your programming language, which means that the integration of languages might happen in different ways. Future programming languages could become extensible sets of core language features that might be integrated with(in) sublanguages (LIDs). For example, a domain-specific language could be part of your favourite programming language.
Note that this approach does not remove the need for higher-level domain-specific languages. These are still very important, for example in Model-Driven Software Development. From my viewpoint, AOP approaches are also candidates where LIDs might be helpful. An aspect or a related set of aspects can be considered a language. Tools such as AspectJ help to formalize the language and integrate it with Java. The former HyperJ was also heading in this direction. Another advantage of these kinds of modularized languages is that you don't need to bloat languages or libraries any more to get all those important features into the programmer's toolset. Instead, configure your core language with all the sublanguages you require for your concrete problem. Language integration might become a powerful tool for the future. I am really interested in what others think.
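A small sketch may help to show the difference between a query that is embedded as a string and one that is written in the host language itself. This is not LINQ: it uses Python, where a comprehension serves as a rough analogue of an integrated query, and unlike LINQ to SQL it filters an in-memory list instead of the database. The table, names and ages are made up.

```python
import sqlite3

# Made-up sample data.
people = [("Ada", 36), ("Linus", 17), ("Grace", 45)]

# Embedded query: the SQL text is an opaque string to the host
# language, so typos in it only show up at runtime.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", people)
rows = conn.execute(
    "SELECT name FROM people WHERE age >= ? ORDER BY name", (18,)
).fetchall()
adults_embedded = [name for (name,) in rows]
conn.close()

# Integrated query: the same selection expressed in the syntax of the
# host language, operating directly on its own objects.
adults_integrated = sorted(name for name, age in people if age >= 18)
```

Both variants select the same names, but only the second one is visible to the language's own parser and tools.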

Saturday, June 17, 2006

Lisp again!

I just saw a nice posting on Gernot Starke's blog: http://it-and-more.blogspot.com/2006/05/little-more-on-lisp.html.
There he addresses how developers can learn Lisp. I like this posting because in recent research projects I've made the personal observation that people tend to constantly reinvent the wheel. A good example is developers who grew up with Java, C++ or C#. They are often astonished by new and cool language capabilities such as closures (Ruby) or lambda expressions (LINQ). Why am I talking about this issue? The good ole languages such as Lisp and Prolog invented several of the features that are so exciting. For example, model checking in model-based software development is an area where Prolog turns out to be a clear winner. And Lisp simply is the singularity where the big bang of dynamic languages originated. Knowing these languages is absolutely valuable. Know the idioms of these languages and you will benefit in your daily developer life. Erik Meijer said recently at a conference: "you should learn a new language each year". That's the right strategy :-)

Complexity and Software Architecture

When speaking to other people about software architecture, the term "complexity" typically comes up at some point. And I have to admit, I also use "complex" and "complexity" very often. Have you ever thought about what complexity really means? If you perform a Google search with "define:complexity" you'll get some hits, most of them relating to the blend and taste of wine or coffee. Or is it "generally avoided as an overused and poorly defined word, except in specific systems" as suggested in http://ishi.lanl.gov/diversity/Glossary1_div.html?

At university we have learned that the complexity of algorithms is measured by the amount of "processing time" it takes to solve a given problem depending on the number of input values, e.g.:

O(c), usually written O(1), means that solving a specific problem always takes the same time, independent of the input. Example: a constant function that always returns 42.
O(N) means that the processing time increases linearly with the size of the input.
O(log N): time for searching an element in a sorted array.
O(N log N): time to sort an unsorted array of N values.
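The gap between O(N) and O(log N) can be made visible by counting steps instead of measuring time. The following is a minimal sketch; the function names and the test data are chosen for illustration only.

```python
def linear_search_steps(values, target):
    """Count inspected elements: O(N) in the worst case."""
    steps = 0
    for value in values:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_values, target):
    """Count probes of a binary search: O(log N) on sorted input."""
    low, high = 0, len(sorted_values) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            break
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


data = list(range(1024))  # already sorted
linear_steps = linear_search_steps(data, 1023)
binary_steps = binary_search_steps(data, 1023)
```

For 1024 sorted values, the linear search has to inspect all 1024 elements to find the last one, whereas the binary search never needs more than 11 probes.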

We know that if we can map a given problem in constant time to, let's say, a sorting problem, and vice versa, then the problem will also have O(N log N) complexity. However, this kind of complexity is not that significant for software architecture design, is it? At least, we can start to assume that there might be different kinds of complexity.

In Latin, complexity was defined as the total set of possibilities and capabilities. Thus, we can draw the conclusion that a software architecture is complex if it reveals a large set of properties and capabilities.

An appropriate way to get a gut feeling for software architecture complexity is to ask the following question: What are typical implications when a software architecture IS complex?

Static Structure:

The architecture consists of a whole ocean of entities with lots of different relationships
It typically comprises insufficient or confusing abstractions
There is no clear separation of concerns. For instance, some entities are overloaded with various unrelated responsibilities, a point that is often tightly related to the previous issue

Dynamics:

There are a lot of possible workflows
The system contains many states and transitions

As you surely know the old saying is that software architecture and organization are only two sides of the same coin. Complexity of architecture may thus be caused by your organization:

No clear team responsibilities and role assignments
Insufficient or missing process
High level of political issues in daily work
Documentation-addiction
Missing documentation
Insufficient amount of time dedicated to architecture design
No supervision of architecture realization
Lack of adequate testing
Insufficient team education upfront

In summary, complexity in software architecture is mainly caused by missing or overused abstractions and inadequate separation of concerns in either static structure or dynamics. Or in other words, the RUP 4+1 view helps us to structure complexity in different (4+1) areas. In addition, inadequate processes, tools, education, and organizational issues inevitably cause software architecture complexity.

Good means to prevent complexity are manifold:

software patterns
usage of frameworks and containers
aspect-oriented programming if done right
model-based software development if done right
higher abstraction by introducing domain-specific languages and domain modelling
metrics if applied right
usage of appropriate methods and tools
requirements traceability
...

All of these means help to obtain appropriate abstractions and a clear mapping of responsibilities to entities. Note that humans can normally keep only about seven (plus or minus two) entities in mind at the same time. Hence, this point should be taken into account on each abstraction layer and for each architectural perspective.

But that's only my 2c.

I am wondering what your opinions are w.r.t. complexity? Any complexity definition that makes sense?

Saturday, June 10, 2006

Michael's Pattern Laws

Here are some laws I have found over the last few years as a software architect that I'd like to share with you. Maybe you could share your own insights.

Patterns are no surrogate for human intuition and creativity.
Overload of patterns in your software architecture implies overload of problems in your system. However, not using patterns where applicable may make your life extremely unpleasant.
If you just found that cool new pattern, think again before bothering the rest of us! (Remark: I also want to remind you of Brian Foote's famous words: "a pattern is an aggressive disregard of originality").
Patterns are your best friends if they are treated with friendliness in your architecture design.
It is easy to become a pattern author but it is surprisingly hard to write a good pattern description.
If grandma understands it, it is probably a good pattern description.
Patterns that can be easily formalized are no patterns.
Patterns and Agility? Patterns are about agility. Without patterns your software architecture tends to become overly complex and thus hard to maintain, change, or evolve.
The number of pattern books has significantly increased in the last years, but that is not necessarily a sign of good quality.
Patterns are dead, CORBA is dead. All technologies that turn from hype into pragmatic technologies are declared dead at some point. If a technology is considered dead in this sense, it is typically safe and valuable to use.
A pattern is no island. It reveals its true power when connected with other patterns to form complete landscapes.
You can classify patterns in infinite ways. For example, into structural, behavioral and creational patterns, as the GoF did. Or you may partition the pattern space with respect to granularity, process phases or domain facets. I prefer to have only two classes of patterns, good ones and bad ones.
Time is money. Applying patterns saves time. Thus, patterns are money! Don't forget to tell your managers that.
Always add a real life example to your pattern descriptions because some non-software people in your projects won't see the value of patterns otherwise. Take management as an example.
If you are a good guy, apply patterns. If not, anti-patterns might be a more appropriate choice.
Patterns are not just collections of UML diagrams. I totally agree with Bertrand Meyer who once said "bubbles don't crash" and "all you need is code". Developers should memorize those sentences.
Applying the right patterns the right way is like paradise. Applying the wrong patterns or applying the right patterns wrongly, however, is like hell. Thus, make sure you know what you are doing here.
Sure, you got all pattern books. But that doesn't make you a pattern expert automatically.
As the pointy-haired boss always emphasizes in Dilbert: "work smarter, not harder". Applying patterns is generally considered smart.
Beware of Murphy's Laws when applying patterns.