Hitchhiker's Guide to AI, Software Architecture, and Everything Else

If you are a software engineer: DON'T PANIC! This blog is my place to beam thoughts on the universe of Artificial Intelligence and Software Architecture right to your screen. On my infinite mission to boldly go where (almost) no one has gone before I will provide in-depth coverage of architectural and AI topics, personal opinions, humor, philosophical discussions, interesting news and technology evaluations. (c) Prof. Dr. Michael Stal

Sunday, January 28, 2007

The Singleton Patterns and The Concept of Identity

During my Java EE 5 tutorial at the TOOP 2007 conference, an attendee asked how best to use the Singleton pattern in Java EE. In my opinion the Singleton pattern is more a workaround than a proven solution, and thus it is seldom advisable to apply it.
Basically, what the Singleton does is to guarantee the existence of one single instance of a class.

Example from Java EE:

// get ConnectionFactory via JNDI
ConnectionFactory cf = ... // single instance
Connection c = cf.createConnection(...);

The factory uses the Singleton pattern such as in:

class ConnectionFactory {
    private static ConnectionFactory _instance = null;
    protected ConnectionFactory() { ... }
    public static ConnectionFactory instance() {
        // lazy creation on first access; not thread-safe
        if (null == _instance)
            _instance = new ConnectionFactory();
        return _instance;
    }
}

For the sake of brevity I didn't care about thread safety. For our purpose you can just ignore this fact. In the code you will see that clients outside the class's package and subclass hierarchy cannot call the protected constructor, so the only way for them to obtain an instance is to call the static factory method instance().
This way, there will be exactly one instance whenever one or more clients access the class.
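
The listing above sets thread safety aside. For readers who do need it, one common option in Java is the initialization-on-demand holder idiom. The following is only a minimal sketch under assumptions: the names LazyConnectionFactory and Holder are made up for illustration and are not part of Java EE or of the example above.

```java
// Illustrative only: LazyConnectionFactory and Holder are hypothetical names.
final class LazyConnectionFactory {
    private LazyConnectionFactory() { }

    // The JVM initializes the nested class lazily and exactly once,
    // the first time instance() touches Holder.INSTANCE.
    private static final class Holder {
        static final LazyConnectionFactory INSTANCE = new LazyConnectionFactory();
    }

    static LazyConnectionFactory instance() {
        return Holder.INSTANCE;
    }
}
```

This variant needs no explicit locking, but it is still a Singleton and shares the drawbacks discussed next.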

There are many problems with this approach. The most important one is that a Singleton defines a global variable such as ConnectionFactory.instance().

Suppose we need exactly one object instance. Do we really need to rely on Singleton?
Let me introduce a simple example.

class ThePrinter {
    // final: the address never changes after construction
    private final String printerAddress;
    public void print(String fileName) { ... }
    public ThePrinter() { printerAddress = MY_CONSTANT_VALUE; }
}

In the class above, we can provide multiple instances and each of these instances will semantically define the same object (thus we should provide appropriate equals and hashCode contracts).

From an identity viewpoint, we need to constrain our class to provide objects with the same identity. The way a Singleton tries to guarantee this is by only providing one single instance. But it is legal and much more appropriate (in terms of OO purity) to achieve the same result using an immutable class as shown in the printer example above.
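
To make the equals and hashCode remark concrete, here is a minimal sketch of such an immutable class. The names PrinterRef and DEFAULT_ADDRESS and the address value are assumptions for illustration only.

```java
import java.util.Objects;

// Illustrative only: PrinterRef, DEFAULT_ADDRESS and its value are hypothetical.
final class PrinterRef {
    private static final String DEFAULT_ADDRESS = "printer-1.example.org"; // assumed value

    // Every instance carries the same, unchangeable address.
    private final String printerAddress = DEFAULT_ADDRESS;

    String address() {
        return printerAddress;
    }

    // Two instances are equal when they denote the same printer.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof PrinterRef)) return false;
        return printerAddress.equals(((PrinterRef) other).printerAddress);
    }

    @Override
    public int hashCode() {
        return Objects.hash(printerAddress);
    }
}
```

Clients may create as many instances as they like; collections such as HashSet treat all of them as the same element.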

Another case arises when we need to constrain the number or kinds of instances a class creates, as in a Manager or Pool class. These only allow creating specific instances or a maximum number of instances. Thread and connection pools are typical examples of such behavior. In this context, a Singleton is a Pool with capacity == 1. I won't show you more code here. You can google for Peter Sommerlad's Manager pattern.

Another, "inverse" approach in terms of identity is the following. Suppose you have thousands of users on your Web site. You started by providing one volatile user object for each online user, which made your system less fault-tolerant and more resource-consuming. How could you change this? In other words, how could you provide a limited number of user object instances (# user objects << # online users)?
a) implement a pool of fixed user objects that wait for incoming requests (Active Object pattern)
b) when a request comes in, it is enqueued
c) one of the idle objects dequeues the request
d) it retrieves the user id from the request and looks up the user's data in the database
e) it operates on the user data and stores it back to the database
f) the result is returned to the client
g) the user object is ready to take another request and thus another identity

This approach even works if multiple user objects run under the same user identity. In this context, each pool object can temporarily take any identity because the identity's data itself is persisted somewhere else. Thus, identity is not the same as instance.
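
The steps above can be sketched roughly as follows. This is not a full Active Object implementation: the fixed thread pool of java.util.concurrent plays the role of the pooled user objects and of the request queue, and an in-memory map stands in for the database. The names UserObjectPool, Request and pointsToAdd are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: all names are hypothetical, and the map is a stand-in
// for a real, persistent database.
final class UserObjectPool {
    static final class Request {
        final String userId;
        final int pointsToAdd;

        Request(String userId, int pointsToAdd) {
            this.userId = userId;
            this.pointsToAdd = pointsToAdd;
        }
    }

    // The identity's data lives here, not in the pooled workers.
    private final Map<String, Integer> database = new ConcurrentHashMap<>();

    // a) a fixed number of workers; b) submitted requests wait in the executor's queue.
    private final ExecutorService workers;

    UserObjectPool(int poolSize) {
        workers = Executors.newFixedThreadPool(poolSize);
    }

    // c)-g) an idle worker takes the request, adopts the user id it carries,
    // loads and updates that user's data, stores it back and returns the result.
    CompletableFuture<Integer> submit(Request request) {
        return CompletableFuture.supplyAsync(
                () -> database.merge(request.userId, request.pointsToAdd, Integer::sum),
                workers);
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

A caller would use it like pool.submit(new UserObjectPool.Request("alice", 5)).join(). Any worker may serve any user, because a worker holds no user state between requests.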

Keep this in mind the next time you feel tempted to solve an identity problem with the Singleton pattern or other weak approaches.

It is surprising how often architects and engineers forget to think about identity and how to design and implement it.

Thursday, January 25, 2007

Architecture Refactoring

Refactoring as introduced by Martin Fowler is an excellent practice for bottom-up and structure-preserving gardening activities in your implementation. Refactorings are documented in a canonical form that resembles pattern templates to some extent. A typical example is Extract Interface. Assume you've designed otherwise unrelated classes, each of them implementing the methods store() and load() for serialization. To treat these classes polymorphically, extract the common methods into a common interface, for instance IStorable, which declares these methods and is implemented by the aforementioned classes. In an agile setting, refactoring is an essential tool, because the incremental integration of new use cases keeps the tactical architecture aspects in flux, which means that implementation artifacts require continuous modification. You should "embrace change", as Kent Beck once said, instead of trying to circumvent it - which, as we know, is impossible.
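
As a minimal sketch of the result of Extract Interface: the post only names store() and load(), so the signatures below, the classes Invoice and Customer, and the helper Archive are assumptions for illustration, with a plain string standing in for real serialization.

```java
import java.util.ArrayList;
import java.util.List;

// The extracted interface; the method signatures are assumed for this sketch.
interface IStorable {
    String store();
    void load(String data);
}

// Two otherwise unrelated classes (hypothetical) now share the interface.
final class Invoice implements IStorable {
    private int total;

    Invoice(int total) { this.total = total; }
    int total() { return total; }

    @Override public String store() { return Integer.toString(total); }
    @Override public void load(String data) { total = Integer.parseInt(data); }
}

final class Customer implements IStorable {
    private String name;

    Customer(String name) { this.name = name; }
    String name() { return name; }

    @Override public String store() { return name; }
    @Override public void load(String data) { name = data; }
}

final class Archive {
    // Client code depends only on IStorable, not on the concrete classes.
    static List<String> storeAll(List<? extends IStorable> items) {
        List<String> out = new ArrayList<>();
        for (IStorable item : items) {
            out.add(item.store());
        }
        return out;
    }
}
```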

As you might remember, Joshua Kerievsky came up with "Refactoring To Patterns" in 2004. The basic idea behind his excellent book is to identify places in an implementation that use proprietary solutions for problems which would be better solved using patterns. I also want to give you an example of this approach. Suppose you have implemented a class ShareValues that contains a list of consumers. Whenever a share value changes, you iterate through the list and call event handler methods on each of these objects. While simple to implement, the flip side of this approach is the tight coupling it unnecessarily introduces. As soon as new consumers are added, we have to manually modify the list of consumers in ShareValues. Obviously, applying the Observer pattern comes to our rescue here. The advantage of refactoring to patterns is that we are able to introduce a higher abstraction level.
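
A minimal sketch of where this refactoring could end up: consumers register themselves, and ShareValues only knows an interface. The names ShareListener, shareChanged, addListener, removeListener and update are assumptions for illustration; only ShareValues comes from the example above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical observer interface for this sketch.
interface ShareListener {
    void shareChanged(String symbol, double newValue);
}

final class ShareValues {
    // Consumers subscribe at runtime; ShareValues no longer hard-codes them.
    private final List<ShareListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ShareListener listener) {
        listeners.add(listener);
    }

    void removeListener(ShareListener listener) {
        listeners.remove(listener);
    }

    // Notify every registered consumer about the changed share value.
    void update(String symbol, double newValue) {
        for (ShareListener listener : listeners) {
            listener.shareChanged(symbol, newValue);
        }
    }
}
```

New consumers now call addListener themselves, so adding one no longer requires a change to ShareValues.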

If you take this thinking even further, you enter the world of what I call Architecture Refactoring. In Architecture Refactoring we refactor the architecture itself.

Possible Examples:

Partition Responsibilities: If a component or subsystem has too many responsibilities, partition it into multiple parts, each with semantically related functionality.
Extract Service: If a subsystem does not provide any interfaces to its environment but is subject to external integration, extract a service interface.
Introduce decoupling layer: If components directly depend on system details, introduce decoupling layer(s).
Rename Entity: If entities have unintuitive names, introduce an appropriate naming scheme.
Break Cycle: When encountering a cycle at the subsystem level, break it.
Merge functionality: If there is broad cohesion between two modules, merge them.
Orthogonalize: If two parts of an architecture introduce different solutions for the same problem, choose one preferred solution and eliminate the other.
Introduce strict layering: If, in a layered system, a layer accesses lower layers without necessity (relaxed layering), enforce strict layering.
Introduce hierarchies: If several entities are only variants of a particular entity, introduce a hierarchy.
Introduce Interceptor hooks: If we have to open an architecture for out-of-band functionality according to the Open/Closed principle, interceptors should be introduced.
Eliminate dependencies by dependency injection: Reduce direct and widespread dependencies of Parts in a Whole/Part setting by introducing a central runtime component (Whole') that centralizes dependency handling with dependency injection.
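
To illustrate the last candidate, here is a minimal sketch of constructor-based dependency injection in a Whole/Part setting, written without any container. All names (Storage, InMemoryStorage, OrderPart, BillingPart, ShopWhole) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency that several Parts need.
interface Storage {
    void save(String record);
}

final class InMemoryStorage implements Storage {
    final List<String> records = new ArrayList<>();

    @Override public void save(String record) { records.add(record); }
}

// A Part declares what it needs instead of creating or looking it up itself.
final class OrderPart {
    private final Storage storage;

    OrderPart(Storage storage) { this.storage = storage; }
    void placeOrder(String item) { storage.save("order:" + item); }
}

final class BillingPart {
    private final Storage storage;

    BillingPart(Storage storage) { this.storage = storage; }
    void bill(String item) { storage.save("invoice:" + item); }
}

// The Whole is the single place that knows the concrete dependencies
// and hands them to its Parts.
final class ShopWhole {
    final InMemoryStorage storage = new InMemoryStorage();
    final OrderPart orders = new OrderPart(storage);
    final BillingPart billing = new BillingPart(storage);
}
```

Swapping InMemoryStorage for another Storage implementation then touches only ShopWhole, not the Parts.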

So far, these are only suggestions of potential candidates. But you can see the possibilities now. Obviously, we should collect a whole catalog of such architecture refactorings. Note that higher-level refactorings might entail lower-level refactorings, the same way architecture patterns often refer to design patterns. Extract Service will very likely lead to an Extract Interface refactoring if the implementation already exists. Orthogonalize relies heavily on refactoring to patterns.

From my viewpoint, we have just started to understand and apply refactorings in a more holistic context. As the old saying goes, we have just seen the tip of the iceberg. Architecture Refactorings will guide architects to identify potential problems in a software architecture and also provide them with refactorings to solve those issues.

Sunday, January 21, 2007

How to start with requirements

Often when I am involved in software development projects as a software architecture consultant, team members wonder how to start with requirements engineering and how to proceed once all the requirements are available. As this issue is critical for project success, I will elaborate on it in this and upcoming postings. Note, however, that I won't cover tools such as DOORS or IBM/Rational. Here is my short take on tools in a nutshell, if you wish. Michael's rule of tool usage: Tools are relevant iff they give more support than they cause trouble :-;

So, let us assume we are starting a new project. For simplicity, suppose we are in a company that produces healthcare products. Some senior managers in our company found out that it is time to build a web-based framework for physicians where patients can obtain information, ask for prescriptions online and arrange meeting schedules. Product management asked several customers for their required and desired features, and it turned out that customers would buy such a product. Thus, we got our business case.

As a software architect, the first step is to interview stakeholders about their requirements. The problem here is that senior managers and product managers are often far away from software architecture and technology. Hence, it is our task to clarify requirements with them and translate management-level requirements into technical requirements. For example, I often hear sentences such as "this must be performant and scalable and secure and flexible." Yes, but what exactly do they mean by "flexible" or by "performant"? What if these requirements contradict each other? Another issue here is that different stakeholders often have different perspectives. A project manager in a multi-site software development project might be more interested in partitioning subsystems in such a way that each subsystem is developed by only one site, while a software architect thinks in a more problem-space-oriented way. Political issues might also be lurking, such as sites striving to be responsible for specific parts of the product. Obviously, outsourcing and offshoring are important factors as well. All of these issues need to be addressed by software architects. In order to deal with them, architects must disclose the relevant issues from day one. This shows how important communication skills are for architects, if you ever doubted it.
My first recommendation is: software architects should arrange meetings where they interview all relevant stakeholders. Thus, identify the stakeholders together with senior management. Then take the time to interview all of these stakeholders. Prepare checklists of the questions you want to ask them. Otherwise, it might be difficult to get full coverage or to obtain comparable answers. Note that these checklists differ depending on the personal backgrounds of the interviewees. As a rule, senior managers can seldom give technology or architecture advice (at least not explicitly). In the interviews, clarify requirements. Ask interviewees what exactly they mean by their requirements. You should also be prepared to ask more concrete questions and guide the interviews. For example: our online patient system framework should be flexible. What does flexible mean - can parts be added, removed, or exchanged, or does it refer to modularity? How fast must changes be committed? How many changes are anticipated? Can these changes be added at runtime, at compile time or at maintenance time? What kinds of changes typically occur? Don't forget to ask interviewees about the priorities they give these requirements. If they tell you that all requirements are of identical importance, ask them questions like: "If you had to choose one of these requirements as the most important one, what would it be?" or "What are the features that must be available in the beginning and what do you consider the highest risk?".
After the interviews, set up a kick-off meeting to which all architects and stakeholders are invited. Present your findings to them and also prepare a rough proposal of the project setup (milestones, work packages, activities, requirements with priorities and risks). After the meeting, all attendees should commit to a final project plan, have a common understanding of terms and of what exactly the requirements mean, and know what their risks and priorities are.
In this context, it is important to mention that there are different types of requirements:

Functional requirements define the concrete domain-specific problem (in terms of the problem space, not of the solution space!). Here a DSL is a good start, or at least it should be one of the results of domain modelling.
Operational requirements such as performance or fault tolerance relate to the operation of the final product.
Developmental requirements define architectural issues such as flexibility.

Customers are often more interested in functional and operational qualities. These qualities will also have a huge impact on the strategic core of your product. Developmental qualities are mostly tactical issues. They are of high relevance for the development team, especially when we are going to build a product line such as in our example project.

It often helps a lot to define the boundaries of the system to be built using context diagrams. Additionally, a use case analysis with participation by customers is essential.

It is important to mention that a software product line such as the one introduced in our running example requires special focus on commonalities and variabilities of the products we are going to build. For example, some of our customers might need the patient meetings to be synchronized with Exchange Server, while others might use IBM/Lotus, or even exotic platforms. Thus, our patient framework must be built in a way that allows this kind of variability. Another variability may be the different looks and feels of the Web sites.
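
One possible way to express such a variation point in code is an interface with one implementation per calendar platform. The following is only a sketch under assumptions: CalendarSync, ExchangeSync, LotusSync and PatientPortal are made-up names, and the implementations are stubs that call no real Exchange or Lotus API.

```java
// Hypothetical variation point: how patient meetings reach the customer's calendar.
interface CalendarSync {
    String schedule(String patientId, String slot);
}

// Stub variants; a real product would talk to the respective platform here.
final class ExchangeSync implements CalendarSync {
    @Override public String schedule(String patientId, String slot) {
        return "exchange:" + patientId + "@" + slot;
    }
}

final class LotusSync implements CalendarSync {
    @Override public String schedule(String patientId, String slot) {
        return "lotus:" + patientId + "@" + slot;
    }
}

// The common part of the product line stays the same for every customer.
final class PatientPortal {
    private final CalendarSync sync;

    PatientPortal(CalendarSync sync) { this.sync = sync; }

    String bookMeeting(String patientId, String slot) {
        return sync.schedule(patientId, slot);
    }
}
```

The commonality lives in PatientPortal, while each customer's product is configured with the CalendarSync variant it needs.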

After you have final agreement and commitment on the requirements, you can proceed to the next step and set up the project, which won't be covered in my postings. However, it turns out to be important to assign important subsystems to key developers and also to assign the most important non-functional issues to key developers. Software architects are then responsible for the whole picture, while key developers are responsible for their own topic of concern. In the beginning, architects (and key developers) will design a baseline architecture and communicate this architecture to the stakeholders. The design should reflect functional aspects but also at least the core non-functional concerns.

In our example, security and usability will most likely be the most important requirements. Thus, we should organize two teams, one responsible for security and the other one for usability.

Throughout all project activities it is essential to ensure requirement traceability. Each design or implementation decision should be motivated by requirements and follow the architectural baseline. This is where architects must act as supervisors.

This concludes the first posting in my "series".