Hitchhiker's Guide to Software Architecture and Everything Else - by Michael Stal

Tuesday, September 20, 2011

Using War Stories

For the education of software architects we are using war stories to emphasize important learnings. The whole curriculum is based on the mantra of learning from failure. Errare humanum est! Thus, it is important to see what can go wrong and how to deal with it in a better way. Everyone of us architects knows a whole bunch of war stories from the own career. I also caused failures, but learned from them. It is basically the same like children when they learn to walk. They'll fail but keep on trying until they eventually succeed. It is not a shame to think about own failures. Some cultures force people to always appear perfect which is a bad fundament. As we all encounter failure, it is better to learn what exactly went wrong and why instead of just hiding it from others and ourselves. When we educate architects we do not just teach them theory but also and even more practice. As Einstein once said, "in theory, theory and practice are the same. In practice they aren't."

Friday, September 16, 2011

Fractal Design

If you are going to design a system, you basically need to identify prioritized main use cases and quality attribute scenarios among other forces to design the system. Together with a problem domain model that shows the inner constituents, and a context model that integrates the system into its environment, the design can evolve in a evolutionary way. As a result of the design activities, architects are able to introduce subsystems as well as their relations according to functional or qualitative drivers. And, of course, interfaces start to appear, each of them defining one specific and explicit role of the subsystem.

A subsystem is itself integrated into an environment – the enclosing system under development. So the subsystem can also act as a system. Thus the same principles apply for the subsystem acting as a system itself – you may even define use cases and quality scenarios for the subsystem with use cases and scenarios being derived as a subset of the “outer” use cases and scenarios. After this step, “subsubsystems” are created, often coined “components”.

What we do here is applying the same design principle in a top-down manner.

But is it useful or possible to apply the principle infinitely? No, because at some level the solution domain is shining through. Solution domains tend to introduce their own composition techniques such as assemblies, bundles, EJBs, services, classes and objects. If the top-down design approach reaches this level, designers must map the architecture artifacts to the solution domain. Maybe, we should call this level architectural twilight zone or the problem-solution-boundary Smile

Side remark: To overcome the twilight zone, we could introduce DSLs and use Model-Driven Software Development.

As a rule of thumb, we typically obtain 2 sublevels (subsystems, subsubsystems) as architects. If less, the design is too abstract and vague. If more, we are introducing too many details.

One of the core challenges in this context is the fact that there might be different strategies to view a domain and thus different ways to cut a system into subsystems.

Partitioning a system into subsystems independent of the hierarchy level is influenced by functional aspects and the problem domain. Thus, subsystems should introduce meaningful subdomains of the surrounding problem domain. With other words: methods like Domain-Driven Design together with some extensions can help nicely.

No matter what you do, there will always be crosscutting qualities and topics. The same observation can be made when structuring an organization into divisions, departments, groups. Have you ever seen an organization without overlapping units? The introduction of crosscutting concerns may introduce new sub^nsystems, add new interfaces or even change their implementation depending on the invasiveness of the concrete concern. Each concern can be considered a subordinate view mixed into the subsystem respectively its domain.

Architecture design is basically fractal design up to two levels of depth. The priorities of use cases and quality scenarios as well as their properties (strategic versus tactical) define how and in which order the functional model needs to be refined hierarchically by integrating scenario-based views.

Of course, one person’s solution domain can be another person’s problem domain which is why exceptions to the aforementioned rule might apply.

Thursday, September 08, 2011

The Telephone Test for Software Architecture

We all know that a software architecture should reveal two properties among many others for an adequate internal quality:

Simplicity implies that the software architecture only addresses inherent complexity without introducing accidental complexity. Since, there are typically several ways to solve a problem, there is no simplest architecture available. Instead, there rather are solution architectures that follow one specific solution path with a minimal number of artifacts. As some quotes propose, simplicity is achieved if you cannot take something away from your system without failing to meet its specification.

Expressiveness implies that the artifacts of your architecture are easy to understand. That is, artifacts should have expressive names, and each responsibility should be assigned to one artifact. Thus, components with a multitude of responsibilities are often a bad idea such as are responsibilities spread across multiple components. However, it is particularly difficult to achieve the latter goal due to cross-cutting concerns. An additional step to achieve expressiveness is having role-based, explicit interfaces with concrete contracts.

But how can you test simplicity and expressiveness? There is a good low-tech suggestion for doing this: Let a software architect explain the architecture to an engineer not involved in the project, for instance using a phone call. Limit the time to - let's say - 10 minutes. If the other engineer gets a good idea of the architecture, it is an indication that the architecture is simple and expressive. Of course, I am assuming that the software architect is a person good in communication as I expect from architects, anyway.

Some might argue that design metrics could also help in this context. Indeed, metrics provide some insights. But we shouldn't forget that metrics analyze the structure, not the semantics. Thus, they are not capable of deciding about expressiveness.


Wednesday, September 07, 2011

Apples and Oranges

How often do we hear statements like how immature software engineering is compared to other engineering disciplines? Of course, construction building is a very old craft where engineers could collect huge amounts of knowledge, methods, and experiences over thousands of years.

On the other hand, there is a huge gap between traditional engineering disciplines and software engineering:

  • While other engineering disciplines are focused on specific domains, software engineering is supposed to support countless problem domains.

This is why we came up with technologies that are more general in terms of problem domains such as:

  • UML
  • Architecture Definition and Specification Languages
  • Generic Design Patterns (GoF)
  • Components and Services

However, there is a huge trap using such approaches. Due to their general and generic nature, they are far away from the realities of the problem domain. Thus, communication between software engineers and non IT-knowledgeable stakeholders does suffer.

It is interesting how many experts still emphasize those general-purpose tools and technologies. So to speak, they are addicted to the One-Size-Fits-All drug.

For instance, look at architecture description languages that introduce general components and connections. The problem is that for a concrete problem domain, such generality simply does not work. It is a difference, whether you are dealing with a medical imaging modality or a VoIP platform. Yes, we can … call all building blocks of such systems components or subsystems. And, yes we can … call all interaction paths connections. But, here we are unifying different concepts under one common umbrella. 

What software people often forget is that there are two universes:

  • the Solution Domain with all its supporting tools and technologies. In this domain, general-purpose approaches as the aforementioned make sense,
  • AND the Problem Domain.

In the Problem Domain we can not naively leverage concepts like UML, Components or General Design Patterns. Instead, we need to follow a Problem Domain First Approach. Thus, it is necessary to start with concepts of the problem domain. What does that mean in practice?

  • Introduce DSLs and Domain-Driven-Design to cover the problem domain instead of relying on UML. As a matter of fact we can use the underlying meta-model of UML as a base. It is possible to evolve such a language in an iterative approach.
  • Think about the basic building blocks not as components and subsystems, but use the terminology of the problem domain. You can define your own problem-specific components, subsystems, or services for this purpose. This is exactly what we did for a very large Enterprise Communications System.
  • Consider the availability of Analysis Patterns for your problem domain and subdomains. Use them if available.

Building the Software Architecture then consists of mapping the problem domain to the solution domain. How could we do that? I mentioned the Onion Model several times. So steps could be:

  • Understand the problem domain and build a model of it jointly with domain experts
  • Leverage use cases to understand what the system under development is supposed to deliver from a black-box view.
  • Use the problem domain model to map the use cases to the problem domain artifacts introducing further functional aspects.
  • Stepwise extend the architecture by using strategic quality-attribute scenarios with descending priorities.
  • Prepare the architecture for tactical requirements by using the tactical quality attribute scenarios with descending priorities.
  • Do all this just considering three abstraction levels to limit complexity.
  • Apply architecture patterns to structure the overall system.

Mind the Change! Unlike in many other disciplines, software is considered so soft that it should even support strategic late changes. This is like switching your project goal from creating a coffee machine to creating a power plant. Unfortunately, customers are rarely aware of this problem because they are being constantly told by sales and marketing that changes are no problem. Thus, make customers and maybe even more sales and marketing aware of this fact. And use an agile approach to embrace change. But also have courage to deny changes. This is another reason why the Problem-Domain-First approach is so important. How else could you know what changes or extensions are typical for a problem domain? Think of the tax legislation in your country. It is not helpful to only think of interception and extension hooks or Strategy pattern instantiations in this context.

All general approaches for software engineering have their value and can be used as the underpinnings of your systems. But they should not be used for covering the whole software development from problem domain to solution domain. This is also what we can substantially learn from other engineering disciplines.

Thus, don’t mix apples and oranges. Otherwise, the result may disappoint you.

Sunday, September 04, 2011


As explained in my last posting, emergence is a powerful approach for solving some hard problems in a non-deterministic way. It's a kind of magic? Not really, it is more an architecture pattern.

What do we need for implementing emergence?

We require

a set of active agents,
a (set of) communication mechanism(s),
a common goal the agents implicitly or explicitly share,
an environment,
cooperation strategies of agent (optionally),
a method to determine when to stop (optionally).

The agents are active in that they execute some functionality in order to reach a goal or subgoal. They communicate with other agents to transfer information. And they recognize when their work is done or when they should stop.

There is a lot of possible variation here:

Agents might all share the same role (behavior) or they may have different roles.
Agents might be organized in a peer-to-peer fashion or hierarchically.
Agents might share the same goal or have individual or role-based goals.
Control might be distributed across all agents, or only be assigned to one or a multiple of them.
Agents might be preinstantiated at start-up or adapt their number or roles according to environmental conditions or achievement (being born, living, dying).
Agents might be able to give life to new agents. They may also be able to terminate other agents, even themselves.
There even could be some environmental conditions that result in death or birth of agents.
Communication might be one-to-one or one-to-many or abide to any other message exchange pattern.
Communication might be synchronous or asychronous.
Agents might adapt their behavior including communication due to environmental changes or interaction with other agents.
Which implies, the environment might change.
Agents might "move" in their environment.
There could also be diverse neighbor environments.

In a Map-Reduce system we have a controlling agent and subordinate agents. The controlling master might instantiate slaves depending on problem complexity. Goal of the system is to reach a common goal such as finding all specific entities in an environment. The master communicates with the slaves and provides a subgoal to each of them. The slaves solve the subgoal and may act as masters for their subgoal, thus recursively applying the pattern. Slaves send their findings back to their master which then recombines partial results to the common result.

In Leader-Followers, the leader is always the controlling agent. When it receives an information from its environment, it promotes a follower to become the leader, and switches its role to worker. A worker who is done with a job, switches its role into follower and enters the followers list.

All these systems reveal emergent behavior. And this also resembles swarms like ant populations. There are different roles such as queen, warrior, worker, etc. The population's goal is to sustain by ensuring the survival of the population and their offspring as well as giving birth to new populations. When searching and locating food sources, ants use pheromons for communication. They show warrior strategies for fighting enemies. And they always adapt to their environment in terms of food, war, weather conditions. Ants might even adapt distribution of their roles such as creating more warriors when necessary. The whole population appears as a kind of smart creature.

A human might also be considered as a whole consisting of smaller agents (cells) and this is how evolution actually developed complex life forms.

And if you think about it, the Web itself is just an emergent system.

When we are going to build such an emergent system, we need to consider all these aspects defined above such as roles, hierarchies, communication styles, goals, controls, environment, adaptation, cooperation strategies.

One promising way to implement such systems are Actor-based approaches.

If we had a toolkit for emergent system that allows to configure the different variation points, we could play around with the concept. Akka is one of the excellent frameworks that could serve as a base for such a toolkit.

So, eventually emergent systems are developed using emergent design approaches. Until then, there is a large journey ahead of us.