Hitchhiker's Guide to AI, Software Architecture, and Everything Else: Middleware

In many solution architectures, communication middleware serves as the central backbone that glues all core parts together. Selecting the appropriate middleware is therefore a critical decision in any development project. Unfortunately, in many projects this decision is driven by political forces or personal preferences. Managers often do not recognize that selecting inappropriate infrastructure technologies or tools can be disastrous. Even if architects are ready to base the decision solely on use cases and requirements, it can be incredibly difficult to move in the right direction. As Andrew Tanenbaum once said, "the nice thing about standards is that you have so many to choose from." So the first question is: what kind of middleware is appropriate for what kind of problem? To answer that question, it is important to get an overview of middleware types and paradigms.

Basically, the following kinds of middleware exist:

Messaging Middleware: applications send messages to each other. Message contents are application-specific, while the structure and header information are specified by the MOM (Message-Oriented Middleware). Messages might be sent from one peer to exactly one other peer (point-to-point or queue-based messaging) or anonymously from multiple publishers to multiple subscribers (publish/subscribe or topic-based messaging). Examples: MSMQ, MQSeries, JMS, SonicMQ, JBossMQ.
Remoting Middleware: hides all communication details from developers by extending conventional operation calls over the network. Clients and servers can be implemented almost as if they resided in the same address space. All communication issues are handled behind the scenes by glue components that are typically generated by tools. Examples: RMI, WCF (formerly Indigo), CORBA, ICE, DCOM.
Eventing Middleware: focuses on distributing fine-grained events. It is very similar to messaging middleware, so I won't cover it in detail here.
Distributed Transaction Monitors: provide transaction management across different components. Examples: Tuxedo, CICS, MTS. They will not be covered in this posting because transaction monitors today are mostly integrated into the other types of middleware.
Service-oriented Middleware: a kind of meta-middleware used to integrate other middleware. The most prominent implementation is XML Web services, which are mostly useful in business-level integration scenarios.
Peer-to-Peer Middleware: a combination of messaging middleware and eventing middleware in which remote peers are located through discovery algorithms instead of centralized repositories.
I won't explain multi-agent systems here, as they are rarely used.
Nor will I explain EAI systems here, as they are basically built on top of the aforementioned middleware types.
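
To make the two delivery styles of messaging middleware more concrete, here is a minimal in-memory sketch in Python. It is not the API of any of the products listed above; the `Broker` class and its `send`, `receive`, `subscribe`, and `publish` methods are hypothetical names chosen purely for illustration.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker (hypothetical, not a real MOM API)."""

    def __init__(self):
        self._queues = defaultdict(deque)      # queue name -> pending messages
        self._subscribers = defaultdict(list)  # topic name -> callbacks

    # Queue-based (point-to-point): each message is consumed exactly once.
    def send(self, queue_name, message):
        self._queues[queue_name].append(message)

    def receive(self, queue_name):
        pending = self._queues[queue_name]
        return pending.popleft() if pending else None

    # Topic-based (publish/subscribe): every subscriber gets the message.
    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)


broker = Broker()

# Point-to-point: the second receive finds the queue empty.
broker.send("orders", {"id": 1})
first = broker.receive("orders")
second = broker.receive("orders")

# Publish/subscribe: both subscribers see the same message,
# and the publisher does not know who they are.
inbox_a, inbox_b = [], []
broker.subscribe("prices", inbox_a.append)
broker.subscribe("prices", inbox_b.append)
broker.publish("prices", {"symbol": "XYZ", "price": 10})
```

A real MOM adds persistence, acknowledgements, and network transport on top of this; the sketch only shows the difference in who receives a message.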

The first distinction, then, should be whether your problem domain calls for a method-invocation-based or a message-based approach. Asynchronous operation is one of the important issues here. Although some remoting middleware technologies have also introduced asynchronous method invocations (e.g., CORBA), messaging is more appropriate for asynchronous communication. Asynchronous communication basically means that the sender, for its subsequent processing, either does not expect a result from its communication peer or does not need the result right away. Hence, senders and receivers should be decoupled. Example: sending a purchase order to the order processing subsystem. Messaging is also preferable when you don't want tight coupling between communication partners, for example when a server sends notifications to different receivers, each of which subscribes to different messages from different senders. A further example is the provision of advanced communication styles such as broadcasting or a single request with multiple replies. These styles do not map onto the call semantics of existing programming languages anyway, so it makes little sense to provide them through remoting middleware. Last but not least, messaging is more appropriate when communication links are unreliable. Take as an example a mobile phone that has lost its connection to the network: messages can be queued and delivered once the link is back.
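
The unreliable-link case can be sketched as a store-and-forward sender. This is an illustration under simplifying assumptions (a single process, no persistence), and `StoreAndForwardSender`, `deliver`, and `connected` are hypothetical names, not part of any middleware product.

```python
from collections import deque


class StoreAndForwardSender:
    """Buffers messages locally and delivers them once the link is up."""

    def __init__(self, deliver):
        self._deliver = deliver   # callable that hands a message to the network
        self._outbox = deque()    # messages not yet delivered
        self.connected = False

    def send(self, message):
        # Returns immediately; the sender never waits for a result.
        self._outbox.append(message)
        self.flush()

    def flush(self):
        # Deliver buffered messages in order while the link is available.
        while self.connected and self._outbox:
            self._deliver(self._outbox.popleft())

    def pending(self):
        return len(self._outbox)


received = []
sender = StoreAndForwardSender(received.append)

sender.send("purchase order 1")   # link is down: message is buffered
sender.send("purchase order 2")
buffered = sender.pending()

sender.connected = True           # link comes back
sender.flush()
```

The sender's code path is the same whether or not the receiver is reachable, which is exactly the decoupling that a blocking remote call cannot offer.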

If, however, the sender needs an "immediate" result from a specific communication partner, remoting is much more suitable. Example: asking a credit card company to validate a specific card in a point-of-sale system. The advantage of remoting middleware is the transparency it provides: developers can basically ignore all communication details. But that is exactly the problem. Developers often use remoting middleware as if they were developing non-distributed systems, without taking issues such as latency into account. You can easily imagine the performance of such systems. Another problem with transparency is that hiding details makes errors harder to track down: because everything is hidden, the causes of errors are hidden as well. In summary, remoting middleware and the transparency it offers are overrated in practice.
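
The latency problem is easy to see with some simple arithmetic. The sketch below uses a hypothetical stub, `RemoteCustomerStub`, that merely counts simulated round trips; the 50 ms round-trip time and the customer data are assumptions made for the example, not measurements.

```python
class RemoteCustomerStub:
    """Hypothetical client-side stub that counts simulated round trips."""

    ROUND_TRIP_MS = 50  # assumed network latency per call

    def __init__(self):
        self.round_trips = 0
        self._data = {"name": "Ada", "street": "Main St 1", "city": "Munich"}

    def get_field(self, field):
        # Fine-grained accessor: one network round trip per field.
        self.round_trips += 1
        return self._data[field]

    def get_all(self):
        # Coarse-grained accessor: one round trip for the whole record.
        self.round_trips += 1
        return dict(self._data)

    def elapsed_ms(self):
        return self.round_trips * self.ROUND_TRIP_MS


# Written as if the object were local: three calls, three round trips.
chatty = RemoteCustomerStub()
record_a = {f: chatty.get_field(f) for f in ("name", "street", "city")}

# Designed with the network in mind: one call, one round trip.
coarse = RemoteCustomerStub()
record_b = coarse.get_all()
```

Both variants return the same record, but under the assumed latency the first costs 150 ms and the second 50 ms. The gap grows with every additional field or loop iteration, which is why interfaces of remote objects should be designed differently from local ones.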

Messaging middleware can be extended to provide remoting, and the reason is simple: every remote invocation can be split into two message transfers. A request message is transferred to a receiver, which then sends a result message back to the originator. In fact, all remoting middleware is built upon some kind of messaging infrastructure. Contrary to some claims, the reverse is not true: asynchrony cannot be guaranteed when remoting middleware serves as the underlying base for a messaging layer.
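
The decomposition of a remote invocation into a request message and a result message can be sketched with Python's standard `queue` and `threading` modules. The message layout (`correlation_id`, `operation`, `args`) and the function names `server` and `remote_call` are assumptions made for this example; real remoting middleware defines its own wire protocol.

```python
import queue
import threading
import uuid

requests = queue.Queue()   # carries request messages to the server
replies = queue.Queue()    # carries result messages back to the caller


def server():
    """Consumes request messages and answers each with a result message."""
    while True:
        msg = requests.get()
        if msg is None:    # shutdown signal
            break
        result = sum(msg["args"]) if msg["operation"] == "add" else None
        replies.put({"correlation_id": msg["correlation_id"], "result": result})


def remote_call(operation, *args, timeout=2.0):
    """Looks like a local call; underneath it is two message transfers."""
    correlation_id = str(uuid.uuid4())
    requests.put({"correlation_id": correlation_id,
                  "operation": operation, "args": args})
    reply = replies.get(timeout=timeout)   # block until the result arrives
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not match request")
    return reply["result"]


worker = threading.Thread(target=server, daemon=True)
worker.start()
total = remote_call("add", 2, 3)
requests.put(None)   # stop the server
worker.join()
```

The sketch assumes a single caller. With several concurrent callers, replies would have to be routed to the right caller by correlation ID, for instance through one reply queue per caller.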

Peer-to-Peer middleware simply combines messaging and eventing with discovery strategies. Instead of looking up a receiver's location in a known repository, it is much more reliable to discover a resource, especially when the same resource is available more than once in the network. In a Peer-to-Peer system, a consumer or sender does not care which concrete provider or receiver it is using. Thus, Peer-to-Peer approaches are particularly helpful for coping with decentralization. In theory, a similar approach is possible for remoting middleware. Instead of asking a repository where the server object is located, a client can use a trading service: it asks for a specific service type with specific properties and may get back multiple object references that meet the specified type and property constraints. This is the right way to go if you need a remoting-based approach combined with decentralized lookup strategies.
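
The trading idea described above, asking for a service type with property constraints and getting back all matching references, can be sketched as follows. This is a toy model and not the interface of any actual trading service; `Trader`, `export`, `query`, and the printer offers are hypothetical.

```python
class Trader:
    """Toy trading service: matches offers by service type and properties."""

    def __init__(self):
        self._offers = []   # list of (service_type, reference, properties)

    def export(self, service_type, reference, **properties):
        # A provider advertises a service offer.
        self._offers.append((service_type, reference, properties))

    def query(self, service_type, **constraints):
        # Return every reference whose type and properties match.
        return [
            reference
            for offered_type, reference, properties in self._offers
            if offered_type == service_type
            and all(properties.get(key) == value
                    for key, value in constraints.items())
        ]


trader = Trader()
trader.export("Printer", "printer-1", color=True, floor=2)
trader.export("Printer", "printer-2", color=False, floor=2)
trader.export("Printer", "printer-3", color=True, floor=3)
trader.export("Scanner", "scanner-1", floor=2)

color_printers = trader.query("Printer", color=True)
floor2_color = trader.query("Printer", color=True, floor=2)
```

The client never names a concrete server. It states what it needs and picks from whatever references come back, so providers can be added or removed without changing client code.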

SOA basically means loose coupling. For example, peers are implementation-agnostic: the only thing an application sees is the service interface of the application it needs to communicate with. The communication itself uses a commonly agreed messaging protocol. This approach is best implemented using XML Web services and messaging middleware. SOA is particularly useful for integrating different middleware and application islands with each other.

This posting could only scratch the surface. There are many issues that could have been discussed, such as error handling, fault tolerance, scalability, and security. My intention was to illustrate that no middleware paradigm can be a general solution for all problems. It is therefore essential to specify the problem space first and then identify which middleware solution is appropriate. For the same reason, it is sometimes not possible to rely on a single middleware technology, because multiple problems may require multiple middleware solutions. Remember: select your middleware carefully and base the selection only on the problem domain, not on political or personal preferences.

Thursday, March 30, 2006

Middleware - what Middleware?
