Hitchhiker's Guide to Software Architecture and Everything Else - by Michael Stal

Sunday, August 16, 2015

What the hell is Software Architecture

Since the Nineties many definitions of software architecture have been proposed, most of them being very vague. Eventually, the common denominator of these definitions suggests that software architecture denotes a set of cooperating components. In my opinion, this view is rather simplistic. Most entities in the universe follow the same definition which makes the definition useless from an engineering perspective.

Instead of providing yet another definition of software architecture we should think about its properties.

(i) Software Architecture is both a process and a thing: the process of architecting comprises a sequence of strategic decisions, while the thing is the result of this process. The goal of software architecture is to create the backbone for implementing the specification. Note that we do not assume a specific software development paradigm here, such as Lean or Agile Development.

Remark 1: Strategic decisions refer to requirements and additional forces that affect the whole architecture and result in design artifacts that are tightly coupled with the rest of the architecture, which makes them difficult and expensive to change. Examples include mandatory functional properties of the system, operational qualities such as performance or security, infrastructures for modifiability such as a Plug-in Architecture, and constraints caused by the system context such as hardware prerequisites.

Remark 2: All architecture design decisions must be driven by the specification and the business goals. Put briefly: No decision without a (good) reason.

(ii) Architecture design spans the whole lifecycle of a system not just its creation. It starts with initial planning and requirements engineering and ends when the system(s) built upon this architecture reach their end. Between these various points in time it covers creation, maintenance and evolution.

(iii) In a naive sense every software-intensive system reveals a software architecture, even if it has been created in a completely unsystematic or unintentional way, i.e., using ad-hoc decisions. What we need is systematic architecture design driven by well-defined and prioritized architecturally relevant requirements as well as risks. Sometimes, some or even all parts of an existing system with an unknown or partially known software architecture need to be (re-)used. In this case software archeology methods are required to help extract the hidden software architecture and make it explicit.

(iv) There are two kinds of architecture quality, external quality and internal quality. While the former one defines the externally visible behavior as demanded by quality attributes, the latter defines the habitability of architectural artifacts (such as simplicity or expressiveness) by developers, testers, etc.

Remark: a consequence of habitability is the limitation of software architecture design to a small number of hierarchical entities such as system, subsystem, and components. These entities should follow the Single-Responsibility Principle. Accordingly, the responsibility of fine design and implementation is to refine and extend these abstractions to provide executable artifacts.

(v) For architecture as a process a consistent set of guidelines and tools shall guide the architecture design in order to ensure high quality and business alignment. Without such guidelines the architecture will be overloaded with multiple styles, idioms, patterns, concepts, technologies, paradigms, conventions, all of which reduces internal quality.

Remark: One major challenge is taming inherent complexity while avoiding accidental complexity. The latter can be caused by using the wrong solutions or by applying the right solutions incorrectly.

(vi) Software architecture as a thing is a means for communicating design decisions to other stakeholders. Thus, all decisions must be made explicit in a comprehensive way. The various constituents of the architecture shall be provided to readers in adequate ways depending on their roles and goals and their responsibilities and expectations. For this purpose, software architecture documentation must offer a consistent and complete set of architectural views.

Remark 1: Since software architecture is a thing and a process, its documentation includes the sequence of design decisions and their rationale which relates architecture views and process and which enables requirements and decision traceability.

Remark 2: Software architecture creates a base for design & implementation. It defines an initial (walking) skeleton for deriving the code base. This is why habitability is of foremost importance.

(vii) A software architecture is not an island but embedded into a context. Thus, it is important to strictly separate the architecture from its environment, while at the same time considering and defining the interfaces and interactions between both. Otherwise, it won't be possible to come up with an appropriate software architecture. A context view and use case views are examples that help address these aspects.

(viii) Software architecture must cover both problem domain and solution domain. For this reason, a Multi-Tier design is not an architecture. To avoid monolithic designs both domains should be hierarchically structured into subdomains and their relationships. The organization of software architecture activities shall be driven by the aforementioned subdomains, not by the line organization. In other words, mind Conway's Law!

(ix) Software architecture is not necessarily constrained to a single system. It might also define the base for a set of systems within a specific problem domain. In such reuse contexts a Commonality/Variability analysis is required to define a common base architecture which will be modified for a particular implementation context before fine design and implementation start.
Examples: product lines, ecosystems, platforms/infrastructures, libraries.

Remark: This is what reference architecture and product-line architectures are about.

(x) Architecture design must follow a test-driven approach and be communicated in such a way that the architecture can be tested. Testing provides relevant information to the architects such as revealing quality issues or design flaws. It includes quantitative and qualitative architecture reviews.

I can't resist. Let me conclude this posting with yet another definition of software architecture:

Software Architecture
is a process, i.e., a sequence of intentional strategic design decisions which map specification and business goals to architecture design.
is a thing, i.e., a set of views that address different stakeholders and are the results of the architecture process.

The main challenge for software architects is: there are different ways to design a good software architecture, but there are infinite ways to create a bad one.

Addendum July, 17th

Some may wonder why software architecture should be considered a process as well.

It is insufficient to obtain some architecture views, i.e. the What. In addition we might need information on how the architecture has been created and why it is as it is, i.e. the How and the Why. This is not essential for some stakeholders such as users, but it provides relevant information to developers, customer service, and testers.

One notable example is the detection of design flaws. When we encounter a flaw in our system, we'd like to know when the design flaw entered the "crime scene" and what other decisions depend on this flawed decision. This gives us traceability and the possibility to roll back in a systematic way. Likewise, process knowledge is important for architecture reviews.
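
The traceability argument can be made concrete with a small sketch. It assumes, purely for illustration, that design decisions are kept in a log in which each entry names the earlier decisions it builds on. The decision names and the `dependents_of` helper are hypothetical and not taken from any tool or from the original post.

```python
from collections import deque

# Hypothetical decision log: each decision lists the earlier decisions it builds on.
DECISIONS = {
    "D1-layered-structure": [],
    "D2-message-bus": ["D1-layered-structure"],
    "D3-plugin-mechanism": ["D1-layered-structure"],
    "D4-async-persistence": ["D2-message-bus"],
    "D5-audit-logging": ["D4-async-persistence", "D3-plugin-mechanism"],
}


def dependents_of(flawed, decisions):
    """Return every decision that directly or transitively builds on `flawed`."""
    # Invert the log so we can walk from a decision to the ones built on it.
    built_on_by = {name: [] for name in decisions}
    for name, bases in decisions.items():
        for base in bases:
            built_on_by[base].append(name)

    # Breadth-first walk over the inverted dependencies.
    affected = []
    seen = {flawed}
    queue = deque([flawed])
    while queue:
        current = queue.popleft()
        for successor in built_on_by[current]:
            if successor not in seen:
                seen.add(successor)
                affected.append(successor)
                queue.append(successor)
    return affected


# Decisions that would have to be revisited if the message bus turns out to be a flaw.
affected = dependents_of("D2-message-bus", DECISIONS)
print(affected)
```

Under these assumptions, rolling back a flawed decision means revisiting exactly the decisions this walk returns, which is only possible if the process has been recorded alongside the views.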

Wednesday, May 13, 2015

A Matter of Waste

If a queue has a capacity of 100 units, we may enqueue 10 entities with a volume of 10 units per entity. 10 times 10 equals 100, right?

If we find a parking space with exactly the same length as our car, our search will come to an end - assuming the car must be parked parallel to the sidewalk. Right?

Hmmmh, the last one does not feel right. But what is the problem? It is that we need some additional space to maneuver. This extra space could be considered as waste, but it is in fact a precondition for parking operations. We may call this quality attribute "parkability".

A similar problem arises when a rectangular container with a volume of 100 is to be filled with 100 spheres of volume 1 each. Obviously, we can't expect 100 spheres to fit into a container whose volume merely equals the sum of all sphere volumes. Again, the challenge is caused by the inherent extra space needed by spheres.
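
A rough calculation shows how large this inherent extra space is. The sketch below relies on a fact that is not part of the original post: equal spheres can fill at most pi / (3 * sqrt(2)), roughly 74 percent, of the available space. The function name `max_spheres` is made up for this example, and the result is only an upper bound; a real container will usually hold fewer.

```python
import math

# Highest possible packing density for equal spheres (about 0.74).
DENSEST_PACKING = math.pi / (3 * math.sqrt(2))


def max_spheres(container_volume, sphere_volume):
    """Upper bound on how many equal spheres fit into a box-shaped container."""
    usable_volume = container_volume * DENSEST_PACKING
    return math.floor(usable_volume / sphere_volume)


upper_bound = max_spheres(container_volume=100, sphere_volume=1)
print(upper_bound)
```

So even in the best case about a quarter of the container stays empty, no matter how cleverly the spheres are arranged.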

Do we encounter such challenges in software design as well? Of course!
To be precise, we experience "waste" in two different areas: in our own activities as engineers and in our systems.

Let me start with the latter one, which is primarily caused by technology and design constraints as well as by inherent properties of the problem domain. If you got this cool new 64-core machine with terabytes of RAM, you may want to increase the efficiency of your apps by 64 times. Unfortunately, waste gets in our way again. Competition for resources and inherent dependencies in the problem context will cause the performance increase to be significantly lower than the desired factor of 64. And we did not even consider accidental complexity in this scenario.
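
One common way to estimate this effect is Amdahl's law, which is brought in here as an illustration and is not mentioned in the original post. It only models the sequential part of the work, not competition for resources, so it is optimistic. The 95 percent parallel fraction below is an arbitrary assumption.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / cores)


# Assumption for illustration: 95% of the work can run in parallel.
speedup = amdahl_speedup(parallel_fraction=0.95, cores=64)
print(f"Expected speedup on 64 cores: {speedup:.1f}x")
```

With only 5 percent of the work being sequential, the 64 cores deliver a speedup of roughly 15 instead of 64.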

Interestingly, it is such dynamic overhead that makes it difficult to deal with operational qualities or to build realistic simulators.

But how does this waste issue relate to software engineering activities? Basically, it is the same calculus again.

If we are committed to two different activities with overlapping time spans that need our attention, there is a chance of conflicts. Two resources are competing for your time. As you cannot focus on two activities at the same time, you'll have to focus on one, which leads to a kind of time debt in the other one.

Another approach is switching back and forth between two activities. However, this requires the engineer to stay synchronized with the current state of each activity. Experts in multithreading call this a "context switch". Time needed for such housekeeping duties inevitably causes waste.

Another typical root of waste arises when a consultant or engineer of one company supports a customer, because it is then necessary to deal with two organizations, which brings organizational overhead for vacation, payments, ordering items, time accounting, and meetings.

Often engineers do not consider such overhead, because they underestimate the problem instead of making waste explicit and finding ways to reduce it. But keep in mind: It is impossible to remove all waste, in particular inherent waste.

Experience tells us that at least (!) one fifth of time allocated to an activity will be consumed for "non productive" issues. Thus, we always need to explicitly add a waste factor representing idle time.
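
As a small worked example of this rule of thumb: if one fifth of the allocated time is lost, the allocated time has to be the net effort divided by 0.8. The function name and the 40-hour task are assumptions made for illustration; the 20 percent default is the lower bound quoted above.

```python
def allocated_time(net_effort_hours, waste_fraction=0.2):
    """Time to allocate so that the net effort still fits after waste is deducted."""
    if not 0.0 <= waste_fraction < 1.0:
        raise ValueError("waste_fraction must be in the range [0, 1)")
    return net_effort_hours / (1.0 - waste_fraction)


# Hypothetical task that needs 40 hours of focused work.
planned = allocated_time(40)
print(f"Plan about {planned:.0f} hours for 40 hours of productive work")
```

Note that simply adding 20 percent on top of the net effort (48 hours) would not be enough, because the waste applies to the allocated time, not to the net effort.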

Another reason for waste is communication. Think about reading mail, joining meetings, handling phone calls, chatting with colleagues, drinking coffee. In a magazine I recently read that such interrupted work is a major issue in companies, causing incredible amounts of overhead. This is due to the fact that when humans experience a non-maskable interrupt they'll need to reboot their brain before refocusing and continuing with the interrupted work.

So, if you need to reduce the productivity of colleagues significantly, play the interrupt game. Tom DeMarco once told me about two printer companies in the US where he could relate meeting time and productivity. The more (mandatory) meetings, the less productive a company will be. Maybe this pattern could be called "death by meetings". If you appreciate further patterns, start reading Dilbert by Scott Adams.

- Posted using BlogPress from my iPad

Thursday, March 05, 2015

Technical Debt - The Downside of Metaphors

The term "debt" is a metaphor from economics. Its use for software development seems to be very reasonable. Whenever developers fail to address quality issues in their software, they will have to pay this debt back. That is quite simple, isn't it?

However, we can easily find weaknesses of the term "technical debt":

When a system reaches its end of lifecycle, all debt will be gone. Try this for financial debt.

Financial debt is a separate entity, i.e. two debts do not intertwine, while software debt sometimes can't be easily located, isolated or separated, because it is woven into the system.

Technical debt may stay unpaid, if software engineering comes to the conclusion that the cost for refactoring would be higher than keeping the debt untouched.

Financial debt is created intentionally. This is certainly true for some technical debt issues as well such as temporary workarounds. However, many kinds of technical debt are accidental without architects even recognizing it.

Technical debt is destructive. If possible, you would like all of it to be eliminated. In contrast, financial debt often is more constructive - such as increased cash flow.

Financial debt is independent of other system parts, while technical debt in one place can be affected by other parts, and vice versa. An example would be a modification of the system that makes the code parts containing the technical debt obsolete.

Interest rates are mostly determined by banks and usually stay fixed over the specified time. In the context of design debt the interest rates increase the longer the debt is not treated. The pay back time is usually not predetermined. And the same kind of debt may be more expensive to pay back in one system part than in other parts.

While almost all forms of financial debt are paid back continuously until the debt plus the interest rates are fully covered, each single technical debt in most cases must be paid back at once.

There is one kind of debt in the financial domain, while technical debt has many different shapes.

If you think about it, you'll find other metaphors that are more convincing. For example, a tumor or infection has many similarities with what we call technical debt: It can occur accidentally, or intentionally if you don't take care of your health. It must be paid back at once - you are either ill or not, which isn't quite true in practice, but very close. The longer it is not treated, the worse the health condition will become - think of viruses that spread and damage other organs. Infections and technical debt are mostly destructive. Both can be monitored using imaging devices or laboratory diagnostics. There are tumors that are difficult to get rid of and that cause severe damage, while others are harmless if treated early. The worst ones can even become lethal. So my first guess would be to call it software infection instead of software debt.

Another possibility is to use environmental pollution as a metaphor. Like technical debt it may occur intentionally or (most of the time) accidentally. You must pay it back, but you can do this part by part. If left untreated, the situation worsens. It is purely destructive. Often, it can't be easily separated from the environment. Thus, another possibility would be to call it software pollution. Software Architecture Sustainability is a metaphor that is related to environmental issues. And so is the term "Software Ecosystem".

Other metaphors could be considered as well such as terms for similar issues in hardware or building architecture (consider design erosion).

My personal favourite is Software Infection (respectively technical infection, code infection, design infection). Of course, this metaphor has its own liabilities, but it is much closer to its cousin in software development than the debt metaphor is.

And it sounds appropriate to speak about a sick component or architecture. Or to visualize an infection using a modality like in medicine.

I know that it is too late to get rid of the expression "technical debt", but at least we should handle it with care. Some metaphors that sound good in the beginning may turn out to be not well aligned with what we like to visualize.


Saturday, February 28, 2015

Are Patterns like Mummies?

In 1994, the Gang of Four (Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides) published their groundbreaking book on design patterns. At almost the same time we were creating the first POSA book (Pattern-Oriented Software Architecture), which was eventually published in 1996. The software engineering community got flooded by the Pattern Wave. And soon there arrived an inflation of pattern books, some of them excellent, but many of them of mediocre quality.

I remember that almost every software development magazine addressed patterns regularly and enthusiastically in the Nineties. The GoF became the Beatles of architects and developers. And most experts anticipated a myriad of new patterns rising on the horizon.

Actually, several pattern books have been published since then, but without the impact the GoF had. If you ask software engineers about the patterns they know, almost all will mention GoF patterns, many may illustrate some POSA patterns as well, and only a few will come up with other patterns such as those in Martin Fowler's book on Enterprise patterns.

This could suggest that no other important patterns can be found in software habitats anymore, because GoF already got them all. But even if this were the case for general purpose design patterns, shouldn't there be some excellent patterns lurking in more specific domains? Pattern experts have tried to come up with complete pattern languages for their domains. In theory, such pattern languages are beneficial. In practice, it is impossible to cover a medium to large-sized domain with a pattern language because of the inherent complexity involved. Thus, it is not surprising that existing languages cover only tiny domains or fail to completely cover larger domains.

Does the whole universe only know the GoF patterns and that's it? In this case the seminal GoF book would be the holy grail of software development. If not, where do all the unknown patterns hide?

Let us assume we are asked to improve the GoF book. What exactly would we change? Patterns like the Null Object Pattern or the Extension Interface pattern (see POSA Vol. 2) could be added, the Singleton pattern removed and other patterns such as Abstract Factory improved. Patterns are not carved in stone but subject to future changes, albeit not with a high evolution speed.

If you look at patterns from 30000 feet, you'll recognize that there are some benefits of patterns not related to coding.

Pattern forms are a good mental tool to document design. For example, they comprise a name, a context, a problem with forces, and a solution. This helps document all kinds of architecture decisions in a structured way.

Patterns are also applicable to document best practices for transformations, data representations, and many other topics. For example, each refactoring can be considered a transformation pattern. Best Practices are ideal candidates for patterns.

Patterns introduce an idiomatic viewpoint, as they define a language. Effective usage of software platforms in terms of best practices for frameworks, libraries, APIs and protocols is idiomatic as well. Experience shows that idiomatic approaches help better understand structures and concepts. In software engineering all structures are idiomatic. Languages change and so do Patterns.
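
The pattern form mentioned above (name, context, problem with forces, solution) can be sketched as a small record type for documenting architecture decisions. The class `PatternForm`, its `render` method, and the example decision are hypothetical and only meant to show the structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternForm:
    """A design decision documented in pattern form."""
    name: str
    context: str
    problem: str
    solution: str
    forces: tuple = ()

    def render(self):
        """Return the decision as plain text, one section per line."""
        lines = [
            f"Name: {self.name}",
            f"Context: {self.context}",
            f"Problem: {self.problem}",
        ]
        lines += [f"  Force: {force}" for force in self.forces]
        lines.append(f"Solution: {self.solution}")
        return "\n".join(lines)


# Hypothetical architecture decision written down in pattern form.
decision = PatternForm(
    name="Export formats as plug-ins",
    context="Reporting subsystem of a product used by several customers",
    problem="New export formats are requested more often than releases ship",
    forces=("Core must stay stable", "Formats differ per customer"),
    solution="Define an export interface and load format modules at startup",
)
print(decision.render())
```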

The value of patterns lies not only in their content, but also in their usage as good mental tools, with the capability of addressing all activities. Patterns help share best practices with others. They unfold a language of idioms that provides understandability and maintainability.

Patterns are dead. Long live patterns.



Tuesday, February 24, 2015


Recently, I discussed architecture design with some colleagues who are involved in Product Line Engineering projects. When the term "Reference Architecture" came up, it became obvious after a while that they used the term specifically for Product Line Architecture, while I had Reference Architectures in mind, which are typically created to steer and guide standardization. They might also present architecture styles that are commonly accepted within a domain. Think of compilers which are in most cases structured as a Pipes & Filters architecture which is composed of a lexer, a parser, a semantic evaluator, an optimizer, and a code generator.

A Reference Architecture (RA), on the other hand, is a coarse-grained architecture template different organizations use for guiding their design activities. You may remember the OSI 7 Layers model or the OMG Reference Architecture for CORBA, both invented in the Stone Age of software engineering. An RA is a blueprint that includes no assets other than documents. In fact, it is not very specific but rather abstract.

A Product Line Architecture (PLA) is an architecture for the product line of an organization. It captures the commonalities of a set of similar products and defines variation points using feature models or meta models. It contains different core assets, some of which are ready for use. A PLA is much more specific than an RA and defines a framework with (partially) implemented artifacts and variation mechanisms.

An RA can be used as the core of a PLA. For this purpose, the RA is concretized in Domain Engineering by addressing requirements and constraints of an organization. In the process, engineers may provide additional variation points or bind existing variation points. In the latter case the PLA transforms a variation point to a commonality. If many organizations do this for the same variation point in similar ways, a new extended RA is born.

A PLA can become the base of an RA if it is subject to abstraction and considered an established practice in the domain, i.e. if most PLAs end up in the same RA.
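
The step from an RA to a PLA described above, binding a variation point so that it becomes a commonality, can be sketched as follows. This is a deliberately tiny model under stated assumptions: the `VariationPoint` type, the `bind` helper, and the persistence and transport options are invented for illustration and do not come from any product line tooling.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VariationPoint:
    """A place in the architecture where products may differ."""
    name: str
    options: tuple
    bound_to: Optional[str] = None

    @property
    def is_commonality(self):
        # Once bound, every product derived from this architecture shares the choice.
        return self.bound_to is not None


def bind(point, choice):
    """Return a copy of the variation point fixed to one of its options."""
    if choice not in point.options:
        raise ValueError(f"{choice!r} is not an option of {point.name!r}")
    return replace(point, bound_to=choice)


# Hypothetical reference architecture: everything is still open.
reference_architecture = (
    VariationPoint("persistence", ("relational", "document", "in-memory")),
    VariationPoint("transport", ("http", "message-queue")),
)

# Domain Engineering in one organization: persistence is decided once for all products.
product_line = tuple(
    bind(point, "relational") if point.name == "persistence" else point
    for point in reference_architecture
)

for point in product_line:
    print(point.name, "commonality" if point.is_commonality else "still variable")
```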



Sunday, February 01, 2015

I am back

After a long absence due to health issues I am finally back. I will add new content in the following months with the publication frequency increasing over time.

At the OOP 2015 conference, which ended last Friday, I was in charge of the architecture track. In my own talks I covered internal & external quality as well as ways to ruin a project by (wrong) architecting. The latter tutorial introduces failure patterns for harming software development from the perspective of architects. Unfortunately, there are countless ways to fail, but only a few to succeed.

A software system may be considered as a living organism. If the organism is not robust enough, even small illnesses may cause large damage. Architecture is basically the infrastructure necessary for metabolism, neural transmission or structural integrity (such as the bones). If parts are damaged, e.g., organs, we may either put in new or artificial organs, or maybe only repair a small part.
The software organism is created in an evolutionary process with biological patterns and is subject to design erosion during its whole lifecycle. Obviously, it is important to check the organism regularly to identify potential health problems as soon as possible. With Architecture Tomographs these checks can be fast and intense. In some cases design erosion or cancer dissemination is high enough to make the system impossible to repair. In these situations we had better create a new organism. However, for less harmful kinds of design erosion we may refactor using patterns. In this context, external quality is every property we can observe from outside such as speed or reliability. Internal qualities comprise heart rate, blood pressure, insulin level, bone strength, and so on.

Of course, the model has its limitations, but for me it turned out to be really useful.


Monday, July 14, 2014

Micro Management for Micro Brains - why Micro Services suck

It sounds like a great idea. Instead of building a monolithic enterprise application, just split functionality into small components and run them independently in their own processes. These micro services need to cooperate with each other to provide more advanced functionality. For that purpose, they are going to communicate with other micro services.

Such an approach promises better flexibility in changing and deploying systems. If only a small part of the functionality is changed, this will affect only a small number of micro services. Or, as we like to say, "small is beautiful".

What a wonderful new architecture style. Finally, we have found the silver bullet. Agreed, micro services may not be a silver bullet, but, at least, they are silver shotgun shells.

This idea is not new, though. For example, "actors" provide the same kind of solution. They encapsulate fine-grained services behind message-based interfaces that enforce argument and result types to be value types, so that no side effects may occur. In order to achieve a common goal, actors interact with each other.
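
To make the actor idea tangible, here is a minimal single-threaded sketch. It assumes immutable message objects (frozen dataclasses) and a mailbox per actor; the classes `Actor`, `Counter` and `Collector` and the message types are hypothetical and far simpler than any real actor framework, which would also handle concurrency, supervision and distribution.

```python
from collections import deque
from dataclasses import dataclass


class Actor:
    """Base class: an actor is only reachable through its mailbox."""

    def __init__(self):
        self.mailbox = deque()

    def send(self, message):
        self.mailbox.append(message)

    def receive(self, message):
        raise NotImplementedError

    def process_all(self):
        # Handle queued messages one at a time, in arrival order.
        while self.mailbox:
            self.receive(self.mailbox.popleft())


# Messages are immutable values, so a receiver cannot change the sender's data.
@dataclass(frozen=True)
class Add:
    amount: int


@dataclass(frozen=True)
class Report:
    reply_to: Actor


@dataclass(frozen=True)
class Total:
    value: int


class Counter(Actor):
    def __init__(self):
        super().__init__()
        self._total = 0  # private state, never shared directly

    def receive(self, message):
        if isinstance(message, Add):
            self._total += message.amount
        elif isinstance(message, Report):
            message.reply_to.send(Total(self._total))


class Collector(Actor):
    def __init__(self):
        super().__init__()
        self.seen = []

    def receive(self, message):
        if isinstance(message, Total):
            self.seen.append(message.value)


counter, collector = Counter(), Collector()
counter.send(Add(2))
counter.send(Add(3))
counter.send(Report(reply_to=collector))
counter.process_all()
collector.process_all()
print(collector.seen)
```

The two actors reach their common goal only by exchanging messages, which is the same cooperation style micro services rely on.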

"Micro services" contains the term "services" for a reason. Micro services basically introduce a stripped down resurrection of SOA.

However, the delicious-looking and tempting apple may be poisoned. There are several "challenges" when using this architecture style:

  • Complexity: If we are building a non-trivial application, it will reveal inherent complexity. For the moment, let us assume there is no accidental complexity involved. When this application is based on micro services, where does the complexity go? Unfortunately, it does not go away, but manifests itself in complex connection and cooperation patterns between tiny services.

  • Modifiability: If our enterprise application is partitioned into small micro services, we only need to touch and redeploy small services for changing the system, instead of facing the mess of application monoliths. In theory, this is a perfect solution. In practice, it is not, because for non-trivial changes of micro services, we may need to rewrite a lot of other micro services that are connected with the modified micro services. Remember, the complexity does not disappear but shows up in the topology of the micro services network. This affects not only design but also evolution, assessment and refactoring activities.

  • Internal Architecture Quality: Architecture is about strategic design. Its main purpose is to map the problem domain to the solution domain. If micro services are used as the primary concept for structuring functionality, the problem domain will be mixed up with the solution domain, especially if the usage of micro services is not transparent. Thus, the architecture will very likely be technology-based instead of problem-based. Note that this is another variant of Conway's Law: "show me your architecture and I will know the technologies it depends on".

  • External Architecture Quality: As we all have learned the painful way, the main reasons for architecture failure typically are not functional aspects. The main reasons are quality attributes like performance, extensibility, fault tolerance, and so forth. For each of these quality attributes, well-known design strategies and design tactics are available that present alternative ways to introduce the respective quality into the architecture. Architecture design uses utility trees and scenario diagrams to specify external quality requirements. These quality attributes are often crosscutting, which is why complex networks of services make it hard and sometimes even impossible to design and implement quality attributes.

  • Infrastructure: We may use a technology stack for leveraging the micro services architecture pattern or we may build it ourselves. The latter option is a no go, because we are not in the middleware business, at least most of us aren't.