Hitchhiker's Guide to AI, Software Architecture, and Everything Else: March 2023

Models and Modelling - A Philosophical Deep Dive

Motivation

We use models for designing and documenting systems, and not only in software architecture. Models are also indispensable in other engineering disciplines and in the natural sciences. We have all experienced good and bad models in our daily lives. What is a model really about? And what does a good model look like? Let us enter a (philosophical) discussion about this topic.

What is a model?

A model captures the essence of a domain. It focuses on the core entities and the relationships within a domain from a specific viewpoint, i.e., serving a specific purpose. A model contains rules that must hold for its constituents. Models are used by humans or machines to communicate about the respective domain for a particular purpose.

Examples of models include:

a UML diagram

a street map

a floor plan

an electronic circuit diagram

a problem domain model (DDD)

quantum theory

mathematical formulas

Consequences:

(i) The same domain can be represented using different models, each capturing another viewpoint of that domain. These viewpoints are often simply called views.

(ii) Models can be informal or formal, depending on how they are used as a means of communication. In either case, stakeholders must be able to understand them easily.

(iii) Models introduce abstraction layers by using generalization and specialization, leaving out "unnecessary" or irrelevant details.

(iv) A model does not describe reality but a subset of reality viewed from a specific angle.

(v) Languages are based upon models. A model can be viewed as a language, and vice versa.

(vi) A model may support a graphical presentation or a textual presentation; it may even include both.

The complexity of a model grows with

the number and types of its entities and their relationships,
the kinds and numbers of abstractions being used,
the complexity of its underlying rules.

A good model:

provides a proper separation of concerns (SoC)
consistently applies principles such as the Single Responsibility Principle (SRP), Don't Repeat Yourself (DRY), KISS, or the Liskov Substitution Principle (LSP) in order to achieve the highest understandability and comprehensibility
uses expressive names for all its abstractions, entities, and dependencies
provides an effective and efficient means of communicating among stakeholders
focuses on essence and leaves out everything that does not serve the required purpose of the addressed viewpoint
strictly and consistently avoids accidental complexity
allows simple things to be modeled in a simple way, while remaining capable of expressing complex things in a feasible way

Stakeholders

The creation of a model should be guided by its (types of) stakeholders, in particular by the way they intend to use the model. In this context, a meta model helps define what the set of creatable models should look like. Thus, meta models constitute modeling languages. They help create different models or views.
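As a minimal sketch of this idea, a meta model can be written down as the set of allowed entity types plus the allowed relationships between them, and every concrete model is then checked against it. All names below (MetaModel, Model, violations, and the tiny floor-plan vocabulary) are hypothetical and chosen only for illustration; real modeling languages are far richer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaModel:
    """Defines which entity types and relationships a model may contain."""
    entity_types: frozenset
    # Allowed relationships as (source type, relation name, target type).
    relation_types: frozenset


@dataclass
class Model:
    """A concrete model: named entities plus relationships between them."""
    entities: dict = field(default_factory=dict)   # name -> entity type
    relations: list = field(default_factory=list)  # (source, relation, target)


def violations(model: Model, meta: MetaModel) -> list:
    """Return readable rule violations; an empty list means the model conforms."""
    problems = []
    for name, etype in model.entities.items():
        if etype not in meta.entity_types:
            problems.append(f"unknown entity type '{etype}' for '{name}'")
    for source, relation, target in model.relations:
        if source not in model.entities or target not in model.entities:
            problems.append(f"relation '{relation}' refers to an undefined entity")
            continue
        triple = (model.entities[source], relation, model.entities[target])
        if triple not in meta.relation_types:
            problems.append(f"relation {triple} is not allowed")
    return problems


# Hypothetical meta model for very simple floor plans (an assumption).
FLOOR_PLAN = MetaModel(
    entity_types=frozenset({"Room", "Door"}),
    relation_types=frozenset({("Door", "connects", "Room")}),
)

plan = Model(
    entities={"kitchen": "Room", "hall": "Room", "d1": "Door"},
    relations=[("d1", "connects", "kitchen"), ("d1", "connects", "hall")],
)
print(violations(plan, FLOOR_PLAN))
```

The meta model here plays the role of the language definition: it does not describe any particular building, only what a valid floor-plan model may contain.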

To define an adequate model that serves an intended purpose, all (human) stakeholders should be involved. UML is an example of a modelling language that serves the needs of software engineers but (often) not those of many domain experts. In fact, domain experts might have their own models readily available. While a model might be perfect for machine-to-machine communication, it is not necessarily adequate whenever humans are involved. The more formal a model is, the more easily it can be processed by computers. Humans often need more informal and expressive models instead. If both kinds of stakeholders are involved, we need to strike a balance between formal and informal approaches.

Emojis are an example of an informal model. A human can understand them immediately, but they may be more difficult for a machine to process.

Artificial neural networks, albeit "simple", can be processed by machines very well, but it is hard for a human to understand what they actually do and how they work.

UML is somewhere in the middle of these extremes.

Fortunately, in many mature domains models already exist. The electrical circuit diagram is a proven kind of model. Mathematics is often considered a ubiquitous language with predefined notations. In the context of software engineering, domain models are often only implicitly defined and have become common knowledge within an organization. If software engineers with little or no domain expertise start to develop software applications for the respective domain, they need to make the implicit model explicit. Otherwise they cannot design a software architecture that meets the customer requirements. This is what DDD (Domain-Driven Design) is all about. It tries to come up with a domain-specific model using generic building blocks such as DDD patterns and techniques.

The representation of a model should fit the needs of its stakeholders. For humans, graphical notations often work very well, because they explicitly reveal the structure of a model and are easy to grasp and to handle. For productivity reasons, textual models may be more beneficial and flexible in some cases. As an example, consider software code. For a beginner, graphical code blocks might work very well, while advanced programmers prefer coding textually, because they can mentally map seamlessly between the "graphical" design and the textual code representation. Handling code graphically might just reduce their productivity, effectiveness and flexibility because of all the clutter and constraints.

Model Transformations

To keep many stakeholders satisfied, one possible approach is to introduce different models for different types of stakeholders and to create mappings between these models, for example an easy-to-understand UML model that is transformed into a machine-readable XML schema.
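A rough sketch of such a mapping, under strong simplifying assumptions: a class from a UML-like model is reduced to a name plus typed attributes, and a small function transforms it into an XML Schema fragment using Python's standard xml.etree.ElementTree module. The function name class_to_xsd, the TYPE_MAP table and the "Customer" example are hypothetical; a real UML-to-XSD transformation would also have to handle associations, multiplicities, inheritance and more.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Assumed mapping from model attribute types to XML Schema built-in types.
TYPE_MAP = {"str": "xs:string", "int": "xs:integer", "bool": "xs:boolean"}


def class_to_xsd(class_name: str, attributes: dict) -> str:
    """Transform a class description (name + typed attributes) into an XSD string."""
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=class_name)
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    # Each attribute of the class becomes a child element of the sequence.
    for attr_name, attr_type in attributes.items():
        ET.SubElement(
            sequence, f"{{{XS}}}element", name=attr_name, type=TYPE_MAP[attr_type]
        )
    return ET.tostring(schema, encoding="unicode")


# Hypothetical "Customer" class with two attributes.
xsd = class_to_xsd("Customer", {"name": "str", "age": "int"})
print(xsd)
```

The human-friendly side (a class with named attributes) and the machine-friendly side (the schema) describe the same thing; the mapping rules live entirely in the transformation function and the type table.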

In fact, software engineers are used to handling different models that are mapped onto each other. In software engineering, a compiler performs a model transformation from a high-level language to a lower-level language, such as machine code or code for an interpreter. A UML diagram might be transformed into high-level language code. A low-code/no-code environment creates domain-specific applications from high-level user specifications. However, model-to-model transformations can be quite complex, in particular when the gap between models is very large and no commercial off-the-shelf solutions for the transformations are available. Moreover, the more models there are, the more transformations are necessary. Note: a model transformation might also be done manually if the model is not too complex and the mapping rules are straightforward.

Model sets

In domains such as building construction or software engineering, multiple views are necessary to represent information from different angles. Take the design view, deployment view or runtime view as examples in the software engineering domain. In addition, there might be different model abstraction layers, for example an in-depth design view versus a high-level software architecture view. In other words, to solve a task we need a model set instead of a single model that captures every detail from every perspective.

No matter how the views differ from each other, there needs to be meta information that ties the different views together. Prominent examples are the mapping from a view to code, and the implicit or explicit relations between views. Note: there might be different solutions or model kits for the same problem context, e.g., RUP’s 4+1 view model, as opposed to TOGAF, might not be the (only) solution of choice for designing an enterprise system.

No matter what model set you choose, make sure that it is used consistently. In most cases tool support is strongly recommended. Models can become very complex. Therefore you need a tool to draw, check and communicate the concrete models. This is the main reason why most software engineering activities rely on some sort of UML environment such as Enterprise Architect or MagicDraw.

A ground plan is different from an electrical plan. All models together are necessary for building construction. In this example, there might also be rules and constraints that span all models or views. For example, an electrical cable should keep a minimum distance from a water pipe. Consequently, we need some kind of verification algorithm to check whether rules or constraints are violated.
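One way such a cross-view check could look is sketched below. It assumes, purely for illustration, that cables and pipes from the two plans are reduced to straight 2D line segments in shared plan coordinates (metres) and that a single minimum clearance applies. The function names, the sample data and the 0.3 m threshold are hypothetical and not taken from any building code.

```python
import math
from itertools import product


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _cross(o, a, b):
    """2D cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _properly_intersect(s1, s2):
    """True if the two segments cross at an interior point of both."""
    (a, b), (c, d) = s1, s2
    return (_cross(a, b, c) * _cross(a, b, d) < 0
            and _cross(c, d, a) * _cross(c, d, b) < 0)


def segment_distance(s1, s2):
    """Minimum distance between two 2D segments (0.0 if they cross or touch)."""
    if _properly_intersect(s1, s2):
        return 0.0
    (a, b), (c, d) = s1, s2
    return min(
        _point_segment_distance(a, c, d),
        _point_segment_distance(b, c, d),
        _point_segment_distance(c, a, b),
        _point_segment_distance(d, a, b),
    )


def check_clearance(cables, pipes, min_distance):
    """Return (cable_id, pipe_id, distance) for every pair that is too close."""
    found = []
    for (cid, cseg), (pid, pseg) in product(cables.items(), pipes.items()):
        distance = segment_distance(cseg, pseg)
        if distance < min_distance:
            found.append((cid, pid, distance))
    return found


# Hypothetical data extracted from the electrical view and the plumbing view.
cables = {"c1": ((0.0, 0.0), (4.0, 0.0)), "c2": ((0.0, 2.0), (4.0, 2.0))}
pipes = {"p1": ((0.0, 0.1), (4.0, 0.1))}
for cable_id, pipe_id, distance in check_clearance(cables, pipes, min_distance=0.3):
    print(f"{cable_id} is only {distance:.2f} m away from {pipe_id}")
```

The check only works because both views share a coordinate system; that shared reference is exactly the kind of meta information needed to tie views together.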

Model Creation

Models should never be created in a big-bang approach. They are living entities that change over time as you gain experience. They may start out very simple but become more complex over time. Whenever they become overengineered, they need to be simplified or refactored. Model creators need to ensure that stakeholders can use and handle the models easily. If stakeholders have different viewpoints on the same problem, create a model set in which each model view serves a particular set of stakeholders.

To start creating a model for a domain context, we should figure out whether such models already exist and, if so, whether these models can serve the desired purpose(s). It is always beneficial to use existing models, in particular because of the experience and knowledge they carry. So, don't reinvent the wheel unless absolutely necessary, especially if you are not an expert in the domain.

If no model exists, stakeholders should jointly create a model (set). It is helpful if at least one of the stakeholders is experienced in creating models while at least some other person is a domain expert.

If models exist that do not serve the intended purpose, we might change and adapt these models to fit our needs.

Note: a common mistake is to first focus on the syntax of a model. Instead, initially think about its semantics and find a good syntactical representation afterwards.

No matter how a new model is created, learning its representation should be a quick and straightforward process, even for inexperienced stakeholders.

Interestingly, most graphical models consist of rectangular or other symmetrical shapes, arrows, lines and text boxes, while textual models often use regular or context-free grammars. The reason is that this keeps the models comprehensible and easy to handle. It should also be possible to draw a model by hand in order to discuss it with other stakeholders before documenting it. Sitting around a modelling tool significantly decreases productivity, at least in my experience. A whiteboard or a flip chart is by far the best tool for modelling. This can be complemented by AI software that recognizes hand-drawn models and transforms them into clean and processable data representations.

Summary

In this blog post I did not reveal anything new or innovative that you didn't already know. Nor was it my intent to provide anything revolutionary. It is just a summary of modelling and how to approach it. And if it got you thinking about modelling from this more philosophical view, I'd be happy.

Sunday, March 26, 2023