Hitchhiker's Guide to Software Architecture and Everything Else - by Michael Stal

Sunday, July 22, 2012

Reviewing systems - the sum is more than its parts

The profession of a software or system architect is not only about creating new systems, it is also about assessing and improving existing systems. If change is the rule and not the exception, architectures grow and change continuously. To keep them sustainable, review and assessment techniques are of utmost importance.

An architecture review consists of different phases:

  • In the clarification or scoping phase we need to define the goal of the architecture review as well as the one to three key questions the review is supposed to answer. For example, a goal could be to validate that our new system architecture implements the desired dependability quality attributes appropriately. One of the key questions could be: “Can the software architecture achieve five nines of availability?”
  • In the analysis or information phase we need to read documents, interview stakeholders, check code and test cases, watch demos, among other things, so that we are able to answer the key questions.
  • In the evaluation phase, reviewers investigate the strengths, weaknesses, opportunities and threats posed by the current system and related to the review goal. If there are risks in the system, reviewers should define recommendations to mitigate these risks.
  • In the feedback phase, all information (findings, recommendations) provided by the review is returned to the team responsible for system development.
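A key question such as the five-nines example from the clarification phase becomes easier to discuss once the availability target is translated into a downtime budget. The following is a minimal sketch of that arithmetic; the function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget in minutes for an availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare a moderate target with the five-nines target from the key question.
    for label, target in [("three nines", 0.999), ("five nines", 0.99999)]:
        minutes = allowed_downtime_minutes(target)
        print(f"{label}: {minutes:.2f} minutes of downtime per year")
```

Five nines leaves a budget of roughly five minutes of downtime per year, which gives reviewers a concrete yardstick when they look at failover, update, and recovery mechanisms.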

These phases roughly correspond to the model specified by RUP (IBM Rational Unified Process): Inception, Elaboration, Construction, Transition. 

Unfortunately, there are many different types of reviews and review techniques: qualitative and quantitative reviews, scenario-based and experience-based reviews, as well as code, design and architecture reviews. So which one should we choose for which purpose?

In my experience, it is often best to conduct an experience-based review as outlined in the description of the phases, but integrate other review techniques as well, for example:

  • If quality attributes are in the main scope of the review, ATAM (Architecture Tradeoff Analysis Method) could be added to concretize quality attributes using scenarios and compare them with the actual architecture, thus determining the risk themes.
  • For obtaining information from stakeholders on architecture issues, ADRs (Active Design Reviews) are a complementary means, so that reviewers do not have to rely on interviews only.
  • Quantitative assessments such as metrics, prototypes, simulators help to obtain more detailed information about the system under review and its capabilities and limitations.
  • Code and design reviews help reviewers gain more insights about the details of the system. Of course, the code and design reviews are constrained to the parts relevant for the overall review goal.

The toolbox of the software or system architect should contain the relevant review techniques. Whenever architects are involved in a review, they should determine in the clarification phase which techniques they will use in the subsequent phases.

Note that reviews do not need to follow a waterfall model. You may and should use an agile approach, answering the most relevant key question or its most important aspects first in time-boxed increments.

Such reviews may require weeks, but can also be conducted as flash reviews in a single day. The more you are able to narrow the review scope, the less effort it takes. If you conduct regular reviews after each increment, the architectural deltas are typically small. This lets you narrow the scope of your regular reviews, which can then be done in a day. The findings are used to refactor and improve the system architecture.

Reviewers should be experienced in software/system architecture as well as in review techniques. To build up a review culture in your organization, start by enabling the lead architect to conduct reviews. Then, use the master-apprentice model to constantly increase the review skills in the rest of the team.

But as mentioned earlier: do not rely on one single review method but establish a toolbox of review and assessment techniques that can be used in combination to enforce architecture sustainability.

Saturday, January 07, 2012


DSLs are currently being promoted by a large number of activists. If you ever read the seminal Pragmatic Programmers book, you'll also find such a recommendation there. So, will the software engineering universe soon turn into a paradise? Let me tell you two examples from industrial practice:

Once upon a time, in the logistics domain, a smart and lazy software engineer introduced a new language just for himself, in a very ad-hoc manner. When they saw the result, all his colleagues got very excited about the concept and started extending the language for their own purposes. The language inevitably grew and grew. Soon, an increasing number of system parts were depending on the language. Unfortunately, when our smart inventor grew old, he had to leave and left his colleagues on their own. But now, there was no one left in the company who really understood the language concept and its realization. It is claimed that the language is still growing. Other theories assume the system had to be completely rewritten in the meantime.

In another universe and another time, another software engineer was convincing his fellows to introduce DSLs. All the IT dwarfs went crazy inventing their own languages. Yes, they even held competitions to see who could come up with the most fascinating or useful language. After a while, the DWARF software system was nothing but an ocean of languages. Only wizards were able to generate a working solution. When a furious dragon (= customer) attacked their town, everything imploded. No one has a clue where the remains of the dwarf town are located.

What can we learn from these failures?

DSL design is (like) architecture design. An ad-hoc language will lead to design erosion, and cause more harm than good in the long run. DSLs represent strategic design tools that should only be in the hands of experienced designers.

There are purposes for which DSLs are a perfect fit. But there are also circumstances where they shouldn't be applied. Overdoing it increases accidental complexity.

It is like component-based re-use. Each DSL should only have one responsibility.

Plan to grow the DSLs systematically, refactor them and verify their correctness.

Communicate with all stakeholders that are affected by the DSL.

Consider the design choices for the language syntax. Should it be a text-based language, a graphical language, or both? Mind the XML hell - long, long ago almost everyone was striving for XML based languages. But in some rare contexts XML made writing configurations and documents ineffective, incomprehensible, and tedious. I've seen systems with thousands of XML files.

Let experts build and grow languages.

It is amazing how (often) our discipline gets addicted to all kinds of panacea. Once you are lost in the buzzword jungle, all systems and stakeholders will suffer from DSL paranoia. DSLs are an awesome means for boosting productivity. However, in the hands of the wrong persons or used with the wrong motivations, they make it easy to shoot yourself in the foot.

Wednesday, October 12, 2011

Architecture Governance

In a large project a product line is developed that supports multiple smart phones. The platform development team puts all code in its super magical configuration management system. Whenever one of the phone teams starts with a product development, it extracts the platform code from the CM system, and adapts it to its own needs. In other words, even the core assets (i.e., the platform itself) are modified. Of course, the platform team is not aware of all these activities. C'est la vie.

This is an example of reuse, albeit one which rather serves as a war story. How come? Imagine the next version of the mobile platform is going to be developed. How can the domain engineers ever evolve a system that was modified by various product development groups in different ways? The product line has literally been transformed into a bunch of one-off applications by unsystematic reuse. One approach to prevent such architectural war stories is what we call Architecture Governance.

Architecture Governance is a systematic approach for managing architectures and controlling all modifications in order to ensure quality and sustainability. This holds for all modifications, those for developing the system and those for evolving it.

But who is in control? On one hand, there should be an architecture control board in charge of the technical architecture, and on the other hand, another control board should be in charge of the business and business strategy - Luke Hohmann once coined the terms "tarchitecture" and "marchitecture" in this context.

It is important to think about governance in general. Architecture governance is no island. It must be balanced with IT Governance, SOA Governance, and other classes of governance. Governance is about preferring control and monitoring over trust.

WTF does "control" mean? In the product line example, it means we need to introduce an owner of the platform. And we need someone responsible for business & strategy. Only these two "persons" are allowed to decide on modifications of the platform in terms of business and architectural sustainability. Of course, the business control board supervises the architecture control board. Ultimately, it is the business that drives the development.

For architecture control, guiding principles should provide policies for modification and evolution. Obviously, the internal and external quality of the system as well as its expected behavior must never be compromised. That's why we need tools to check and enforce policies, tools to assess the architecture, and test suites to obtain the respective information. This is where monitoring comes in. By the way, "tools" in this context also include reviews. And mind the gardening activities! How can a system be systematically prepared for modification? This is where reengineering and in particular refactoring become important.
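As one illustration of a tool that checks a policy, consider the product line example: the policy "product teams must not modify core assets" can be monitored by comparing content hashes. The sketch below is only an assumption about how such a check might look; the function names `fingerprint` and `modified_core_assets` and the directory layout are hypothetical and not taken from any particular CM system.

```python
import hashlib
from pathlib import Path


def fingerprint(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 digest of its content."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def modified_core_assets(baseline: dict[str, str], copy: dict[str, str]) -> list[str]:
    """List core assets that were changed or removed in a product team's copy."""
    return sorted(
        name for name, digest in baseline.items() if copy.get(name) != digest
    )


if __name__ == "__main__":
    # Hypothetical directories: the released platform and one product team's checkout.
    platform = fingerprint(Path("platform_release"))
    product = fingerprint(Path("phone_team_checkout"))
    for asset in modified_core_assets(platform, product):
        print(f"policy violation: core asset modified or missing: {asset}")
```

Such a check only detects violations. Deciding what happens next remains a matter for the architecture control board.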

But there are even more details to bother about:

We need to address legal and regulatory issues. Any change must not violate such standards. Think of safety features for medical products as an example.

It is important to care about quality of service. Assume a modification leads to a breakdown of KPIs or SLAs.

Don't ignore or neglect quality attributes in general. Does a modification influence sensitivity or tradeoff points? Will it introduce new risks?

An issue might also be patent scanning. Is there any newly introduced code, such as open source software, that violates intellectual property rights?

But it is not only about tools. It is also about a governance process as well as assignment of responsibilities to roles and persons. For example: Who is allowed to change, configure, update, or add what and when? What measures are necessary in order to guarantee quality and sustainability? How does information flow between the different actors? What happens in the case of policy violations?

Note: This does not only hold for product lines but also for one-off applications. Unsystematic modification almost always leads to problems because of unwanted side effects, accidental complexity, and lack of transparency.

What does this imply for the design of new architectures? It is the same issue: we need a business owner, an architecture owner, a design and implementation strategy, test-driven development, refactoring, and so forth. I happen to have introduced all these topics already in my blog :-) Modifying and evolving a system is only a special case of designing it.

By the way: An agile development process supports architecture governance in that its iterative incremental approach already introduces control and monitoring points.

In practice, there are many ways to support architecture governance by architectural means. For the sake of brevity, I cannot describe all of them. But let me give you some examples:

Using the Layers pattern in a strict sense helps to protect subsystems. Think about safety segregation.

Clean Coding represents a good way to make control and monitoring easier.

Well documented software architectures are easier to modify and control.

There are many architectural means to foster governance.
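To illustrate the Layers example: in a strict interpretation, a layer may only depend on the layer directly beneath it, and this rule can be checked mechanically. The following is a minimal sketch under that assumption; the layer names and the function `strict_layer_violations` are hypothetical.

```python
# Hypothetical layer order, from top to bottom.
LAYERS = ["ui", "application", "domain", "infrastructure"]


def strict_layer_violations(dependencies: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return every (source, target) dependency that breaks strict layering.

    Strict layering here means: a layer may only use the layer directly below it.
    """
    rank = {name: index for index, name in enumerate(LAYERS)}
    violations = []
    for source, targets in sorted(dependencies.items()):
        for target in sorted(targets):
            if rank[target] != rank[source] + 1:
                violations.append((source, target))
    return violations


if __name__ == "__main__":
    observed = {
        "ui": {"application", "infrastructure"},  # skips two layers
        "application": {"domain"},
        "domain": {"infrastructure", "ui"},  # upward dependency
    }
    for source, target in strict_layer_violations(observed):
        print(f"violation: {source} -> {target}")
```

Run against the sample data, the check reports the upward dependency from `domain` to `ui` and the layer-skipping dependency from `ui` to `infrastructure`. In a real project the dependency map would have to be extracted from the code base, which is outside the scope of this sketch.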

Keep in mind that architecture governance is not about controlling architects or developers. Instead, governance helps architects and developers to keep in control. It requires additional efforts but its RoI is very high. Ignoring governance leads to project failure, overcomplex systems, buggy implementations, and design erosion. And eventually to dissatisfied customers and users (and dissatisfied architects and developers). Don't let the software govern you. Govern the software!

Thursday, October 06, 2011

Steve Jobs, 1955 - 2011

Born in '63, I had the great opportunity to grow up with personal computers. Even back in the Eighties, Woz and Steve Jobs were already legends for their visionary minds and creations. I liked the first LISAs and Macs but unfortunately couldn't afford to buy one.

When Steve had to leave Apple in 1985 he created the NeXT, which was an extraordinarily successful machine, not so much from a business perspective but from a creative one.

But his popularity exploded eventually after Steve Jobs rejoined Apple as CEO. Steve Jobs stands for creativity, vision, courage, leadership, charisma. He honestly lived and worked literally day and night for Apple and its users. In contrast to many competitors he gave products not only functionality but personality and style.

What only few people know is that Steve also had a strong influence on software. Not only on UI topics but also on programming languages, operating systems, and many other aspects.

I personally have never been an Apple "fan boy" before I bought my first iPod, but now I love their products. However, I must confess that other mothers also have beautiful daughters. Nonetheless it is evident that Apple has always been the driver of amazing innovations that were frequently copied by competitors.

That's the reason why I want to thank you, Steve. Many things we take for granted simply wouldn't exist without you. I feel deep respect for you and your work. I am sure you'll always be remembered as one of the leading personalities in the IT Hall of Fame. And I really hope your spirit will always persist within Apple.

The last thing I learned from you, although this has been a sad and bitter experience, is: Carpe Diem! And that we should see the people behind and in front of all these nice products.

Send my greetings also to Doug Adams. So long, and thank you for all the apples!

Tuesday, September 20, 2011

Using War Stories

For the education of software architects we are using war stories to emphasize important lessons. The whole curriculum is based on the mantra of learning from failure. Errare humanum est! Thus, it is important to see what can go wrong and how to deal with it in a better way. Every one of us architects knows a whole bunch of war stories from our own careers. I also caused failures, but learned from them. It is basically the same as with children when they learn to walk. They'll fail but keep on trying until they eventually succeed. There is no shame in thinking about our own failures. Some cultures force people to always appear perfect, which is a bad foundation. As we all encounter failure, it is better to learn what exactly went wrong and why instead of just hiding it from others and ourselves. When we educate architects we do not just teach them theory but also, and even more so, practice. As a saying often attributed to Einstein goes, "in theory, theory and practice are the same. In practice they aren't."

Friday, September 16, 2011

Fractal Design

If you are going to design a system, you basically need to identify prioritized main use cases and quality attribute scenarios among other forces to design the system. Together with a problem domain model that shows the inner constituents, and a context model that integrates the system into its environment, the design can evolve in an evolutionary way. As a result of the design activities, architects are able to introduce subsystems as well as their relations according to functional or qualitative drivers. And, of course, interfaces start to appear, each of them defining one specific and explicit role of the subsystem.

A subsystem is itself integrated into an environment – the enclosing system under development. So the subsystem can also act as a system. Thus the same principles apply for the subsystem acting as a system itself – you may even define use cases and quality scenarios for the subsystem, with these use cases and scenarios being derived as a subset of the “outer” use cases and scenarios. After this step, “subsubsystems” are created, often called “components”.

What we do here is apply the same design principle in a top-down manner.

But is it useful or possible to apply the principle infinitely? No, because at some level the solution domain is shining through. Solution domains tend to introduce their own composition techniques such as assemblies, bundles, EJBs, services, classes and objects. If the top-down design approach reaches this level, designers must map the architecture artifacts to the solution domain. Maybe we should call this level the architectural twilight zone or the problem-solution-boundary :-)

Side remark: To overcome the twilight zone, we could introduce DSLs and use Model-Driven Software Development.

As a rule of thumb, we typically obtain 2 sublevels (subsystems, subsubsystems) as architects. If less, the design is too abstract and vague. If more, we are introducing too many details.
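The recursive nature of this decomposition, together with the two-sublevel rule of thumb, can be sketched as a small tree structure in which every element is treated like a system of its own. This is only an illustration; the class `Element`, the function `assess_depth`, and the sample names are assumptions, not part of any method or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A system, subsystem, or subsubsystem; each level is designed like a system."""

    name: str
    parts: list["Element"] = field(default_factory=list)

    def sublevels(self) -> int:
        """Return the number of decomposition levels below this element."""
        if not self.parts:
            return 0
        return 1 + max(part.sublevels() for part in self.parts)


def assess_depth(system: Element) -> str:
    """Apply the rule of thumb: two sublevels are the typical architectural depth."""
    depth = system.sublevels()
    if depth < 2:
        return "too abstract"
    if depth > 2:
        return "too detailed"
    return "ok"


if __name__ == "__main__":
    # Hypothetical example system with subsystems and subsubsystems.
    shop = Element("WebShop", [
        Element("Ordering", [Element("Cart"), Element("Checkout")]),
        Element("Catalog"),
    ])
    print(shop.sublevels(), assess_depth(shop))
```

The same `Element` type is used at every level, which mirrors the idea that a subsystem can be designed with the same principles as the enclosing system.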

One of the core challenges in this context is the fact that there might be different strategies to view a domain and thus different ways to cut a system into subsystems.

Partitioning a system into subsystems, independent of the hierarchy level, is influenced by functional aspects and the problem domain. Thus, subsystems should introduce meaningful subdomains of the surrounding problem domain. In other words: methods like Domain-Driven Design together with some extensions can help nicely.

No matter what you do, there will always be crosscutting qualities and topics. The same observation can be made when structuring an organization into divisions, departments, groups. Have you ever seen an organization without overlapping units? The introduction of crosscutting concerns may introduce new sub^nsystems, add new interfaces or even change their implementation depending on the invasiveness of the concrete concern. Each concern can be considered a subordinate view mixed into the subsystem or its domain.

Architecture design is basically fractal design up to two levels of depth. The priorities of use cases and quality scenarios as well as their properties (strategic versus tactical) define how and in which order the functional model needs to be refined hierarchically by integrating scenario-based views.

Of course, one person’s solution domain can be another person’s problem domain which is why exceptions to the aforementioned rule might apply.

Thursday, September 08, 2011

The Telephone Test for Software Architecture

We all know that a software architecture should reveal two properties among many others for an adequate internal quality:

Simplicity implies that the software architecture only addresses inherent complexity without introducing accidental complexity. Since there are typically several ways to solve a problem, there is no single simplest architecture. Instead, there are solution architectures that follow one specific solution path with a minimal number of artifacts. As some quotes propose, simplicity is achieved if you cannot take something away from your system without failing to meet its specification.

Expressiveness implies that the artifacts of your architecture are easy to understand. That is, artifacts should have expressive names, and each responsibility should be assigned to one artifact. Thus, components with a multitude of responsibilities are often a bad idea, as are responsibilities spread across multiple components. However, it is particularly difficult to achieve the latter goal due to cross-cutting concerns. An additional step to achieve expressiveness is having role-based, explicit interfaces with concrete contracts.

But how can you test simplicity and expressiveness? There is a good low-tech suggestion for doing this: Let a software architect explain the architecture to an engineer not involved in the project, for instance in a phone call. Limit the time to - let's say - 10 minutes. If the other engineer gets a good idea of the architecture, it is an indication that the architecture is simple and expressive. Of course, I am assuming that the software architect is a good communicator, which I expect from architects anyway.

Some might argue that design metrics could also help in this context. Indeed, metrics provide some insights. But we shouldn't forget that metrics analyze the structure, not the semantics. Thus, they are not capable of deciding about expressiveness.
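The point that metrics see structure but not semantics can be demonstrated with a toy example. The sketch below computes a simple fan-out profile for two hypothetical dependency graphs that have the same shape but very different names; the metric, the function name `fan_out_profile`, and both graphs are assumptions made up for illustration.

```python
def fan_out_profile(dependencies: dict[str, set[str]]) -> list[int]:
    """Return the sorted fan-out (number of outgoing dependencies) of all components."""
    return sorted(len(targets) for targets in dependencies.values())


# Hypothetical architecture with expressive, role-revealing names.
expressive = {
    "OrderService": {"PaymentGateway", "Inventory"},
    "PaymentGateway": set(),
    "Inventory": set(),
}

# Structurally identical architecture with names that explain nothing.
cryptic = {
    "Mgr1": {"Util2", "Helper3"},
    "Util2": set(),
    "Helper3": set(),
}

if __name__ == "__main__":
    # Both graphs yield the same profile, although only one would pass the telephone test.
    print(fan_out_profile(expressive) == fan_out_profile(cryptic))
```

A structural metric rates both variants identically, so it cannot replace a human listener when it comes to judging expressiveness.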