Introduction
Media worldwide are currently jumping on the AI bandwagon. In particular, Large Language Models (LLMs) such as ChatGPT sound appealing and intimidating at the same time. Yet when we dive deeper into the technology behind AI, it doesn't feel that strange at all. Contrary to some assumptions of the tabloid press, we are far away from a strong AI that resembles human intelligence. In other words, blockbusters such as Terminator or Blade Runner are not going to come true in the near future.
Current AI applications, while very impressive, represent instantiations of weak AI. Take object detection as an example, where a neural network learns to figure out what is depicted in an image. Is it a cat, a dog, a rabbit, a human, or something else? Essentially, neural networks process training data to compute and learn a nonlinear mathematical function that works incredibly well for making good guesses (aka hypotheses) about new data with high precision.
On the other hand, this capability proves very handy when dealing with big or unstructured data such as images, videos, audio streams, time series data, or Kafka streams. For example, autonomous driving systems strongly depend on this kind of functionality because they continuously need to analyze, understand, and handle highly dynamic traffic contexts, e.g., potential obstacles.
In this article, I am not going to explain the different kinds of AI algorithms, such as the various types of artificial neural networks and ML (Machine Learning) techniques; that may be the subject of a subsequent article. My goal is to draw the landscape of AI with respect to software architecture & design.
There are obviously two ways of applying AI technologies to software architecture:
- One way is to let AI algorithms support software architects and designers in tasks such as requirements engineering, architecture design, implementation, or testing - which I'll call the AI solution domain perspective.
- The other way is to use AI to solve specific problems in the problem domain, which is why I'll call it the AI application domain perspective.
AI for the Solution Domain
LLMs are probably the most promising approach when we consider the solution domain. Tools such as GitHub Copilot, Meta Llama 2, and Amazon CodeWhisperer help developers generate functionality in their preferred programming language. It seems like magic but comes with a few downsides. For example, you can never be sure whether an LLM learned its code suggestions from copyrighted sources. Nor do you have any guarantee that the code does the right thing in the right way. Any software engineer who leverages a tool like Copilot needs to review the generated code again and again to ensure it is exactly what she or he expects; it takes software engineering expertise to continuously analyze and check LLM answers. At least for now, it appears rather unlikely that laypersons will take over the jobs of professional engineers with the help of LLMs.
Companies have already begun to create their own LLMs to cover problem domains such as industrial automation. Imagine you need to develop programs for a PLC (Programmable Logic Controller). In such environments, the main languages are not C++, Python, or Java. Instead, you'll have to deal with domain-specific languages such as ST (Structured Text, known as SCL in the Siemens world) or LD (Ladder Diagram). Since there is far less source code freely available for PLCs, feeding an LLM with appropriate code examples turns out to be challenging. Nonetheless, it is a feasible objective.
AI for the Application Domain
In many cases, Artificial Neural Networks (ANNs) are the basic ingredient for solving problem domain challenges. Take logistics as an example, where cameras and ANNs help identify which product is in front of a camera. Other AI algorithms such as SVMs (Support Vector Machines) enable testing equipment to figure out whether a turbine is behaving according to its specification or not, which is commonly called anomaly detection. At Siemens, we have used Bayes trees to forecast the possible outcome of system testing. Reinforcement Learning happens to be useful for moving and acting successfully in an environment, for example, robots learning how to complete a task. Another approach is unsupervised learning, such as k-Means clustering, which groups objects and maps them to different categories.
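To make the last example a bit more concrete, here is a minimal, illustrative sketch of k-Means clustering with scikit-learn; the data, the number of clusters, and the library choice are my own assumptions for demonstration, not part of any particular project:

```python
# Illustrative sketch: grouping unlabeled feature vectors with k-Means
# (assumes scikit-learn and NumPy are installed; data and cluster count are made up).
import numpy as np
from sklearn.cluster import KMeans

# Pretend these are feature vectors extracted from products on a conveyor belt.
features = np.random.rand(200, 4)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(features)   # cluster index assigned to each object
print(labels[:10], kmeans.cluster_centers_.shape)
```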
Even more examples exist:
Think about security measures in a system that comprise keyword spotting and face recognition. Autonomous driving uses object detection and segmentation in addition to other means. Smart sensors include ANNs for smell and gas detection. AI for predictive maintenance helps analyze whether a machine might fail in the near future based on historical data. With the help of recommender systems, online shops can provide recommendations to customers based on their order history and product catalog searches. As always, this is only the tip of the iceberg.
Software Architecture and AI
An important topic seldom addressed in AI literature is how to integrate AI into a software-intensive system.
MLOps tools support different roles such as developers, architects, operators, and data analysts. Data analysts start with a data collection activity. They may augment the data, apply feature extraction as well as regularization and normalization measures, and select the right AI model that is supposed to learn how to achieve a specific goal using the collected data. In the subsequent step, they test the AI/ML model with sufficient test data, i.e., data the model has not seen before. Eventually, they version the model and data and generate an implementation. Needless to say, data analysts typically iterate through these steps several times. When MLOps tools such as Edge Impulse follow a no-code/low-code approach, separation of concerns between different roles can be easily achieved: while data analysts are responsible for the design of the AI model, software engineers can focus on integrating the AI model into the system, since the MLOps environment generates the implementation of the model.
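The following sketch illustrates the data analyst workflow described above using scikit-learn; the dataset, the model choice, and the file name are placeholders I picked purely for illustration:

```python
# Illustrative sketch of the data analyst workflow: collect data, normalize,
# select and train a model, test it on unseen data, and version the result
# (assumes scikit-learn and joblib are installed; all choices are placeholders).
import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)                       # 1. data collection
X_train, X_test, y_train, y_test = train_test_split(    #    hold back unseen test data
    X, y, test_size=0.2, random_state=0)

model = make_pipeline(StandardScaler(),                  # 2. normalization
                      LogisticRegression(max_iter=200))  # 3. model selection
model.fit(X_train, y_train)                              # 4. training
print("accuracy on unseen data:", model.score(X_test, y_test))  # 5. testing

joblib.dump(model, "model-v1.joblib")                    # 6. version the model for handover
```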
Software engineers take the implementation and integrate it into the surrounding application context. For example, the application must feed the model with new data and read and process the results once inference is completed. For this purpose, an event-driven design often turns out to be appropriate, especially when the inference runs on a remote embedded system. If the inference results are critical, resilience can be increased by replicating the same inference engine multiple times in the system. Docker containers and Kubernetes are well suited, in particular when customers desire a scalable and platform-independent architecture with high separation of concerns, as in a microservice architecture. Security measures support privacy, confidentiality, and integrity of input data, inference results, and the model itself. From a software engineering viewpoint, inference can in most cases be treated as a black box that expects some input and produces some output.
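As a rough illustration of this black box view, the following sketch wraps a TensorFlow Lite model behind a simple input/output interface; the model file name and tensor shapes are hypothetical, and the surrounding application only ever sees inputs and outputs:

```python
# Illustrative sketch of treating inference as a black box
# (assumes TensorFlow is installed and a converted "model.tflite" file exists;
# file name and data shapes are hypothetical).
import numpy as np
import tensorflow as tf

class InferenceService:
    """Wraps a TensorFlow Lite model behind a plain input/output interface."""

    def __init__(self, model_path: str):
        self._interpreter = tf.lite.Interpreter(model_path=model_path)
        self._interpreter.allocate_tensors()
        self._input = self._interpreter.get_input_details()[0]
        self._output = self._interpreter.get_output_details()[0]

    def predict(self, data: np.ndarray) -> np.ndarray:
        self._interpreter.set_tensor(self._input["index"], data.astype(np.float32))
        self._interpreter.invoke()
        return self._interpreter.get_tensor(self._output["index"])

# Usage from the application's point of view:
# service = InferenceService("model.tflite")
# result = service.predict(sensor_frame)
```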
When dealing with distributed systems or IoT systems, it may be beneficial to execute inference close to the sources of input data, thus eliminating the need to send around big chunks of data, e.g., sensor data. Even embedded systems such as edge or IoT nodes are capable of running inference engines efficiently. In this context, often only the inference results are sent to backend servers.
Finally, operators deploy the application components onto the physical hardware. Note that a DevOps culture turns out to be even more valuable in an AI context because more roles are involved.
Input sources may be distributed across the network, but may also comprise local sensor data of an embedded system. In the former case, either Kafka streams or MQTT messages can be appropriate choices to handle the aggregation of the necessary input data on behalf of an inference engine. Take the processing of weather data as an example, where a central system collects data from various weather stations to forecast the weather for a whole region. In this context, we might encounter pipelines of AI inference engines, where the results of different inference engines are fed into a central inference engine. Hence, such scenarios comprise hierarchies of possibly distributed inference engines.
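The following sketch shows one possible way to aggregate distributed sensor data via MQTT and hand it to an inference engine; it assumes the paho-mqtt package (using the 1.x-style client creation), and the broker address, topic names, and the run_inference() helper are hypothetical:

```python
# Illustrative sketch: aggregating distributed weather station data via MQTT
# before feeding it to a central inference engine.
# Assumes the paho-mqtt package (1.x-style Client(); 2.x additionally requires a
# callback API version argument). Broker, topics, and run_inference() are hypothetical.
import json
import paho.mqtt.client as mqtt

latest_readings = {}  # station id -> most recent measurement

def on_message(client, userdata, msg):
    station_id = msg.topic.split("/")[-1]
    latest_readings[station_id] = json.loads(msg.payload)
    if len(latest_readings) >= 3:                      # enough stations have reported
        forecast = run_inference(latest_readings)      # hypothetical inference call
        client.publish("region/forecast", json.dumps(forecast))

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.com", 1883)
client.subscribe("weather/stations/+")                 # wildcard: all stations
client.loop_forever()
```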
Architecting AI Models
Neural networks and other types of AI algorithms expose an architecture themselves, be it a MobileNet model leveraged for transfer learning, an SVM (Support Vector Machine) with a Gaussian kernel, or a Bayes decision tree. The choice of an adequate model has a significant impact on the results of AI processing. It requires the selection of an appropriate model and of hyperparameters such as the learning rate or the configuration of layers in an ANN (Artificial Neural Network). For data analysts, or for those software engineers who wear a data analytics hat, a mere black box view is not sufficient. Instead, they need a white box view to design and configure the appropriate AI model. This task depends on the experience of the data analysts, but may also imply a trial-and-error approach for configuring and fine-tuning the model. The whole design process for AI models closely resembles software architecture design. It consists of engineering the requirements (goals) of the AI constituents, selecting the right model and training data, testing the implemented AI algorithm, and deploying it. Consequently, we may consider these tasks as the design of a software subsystem or component. If one of the aforementioned MLOps tools is available and used, it can significantly boost design efficiency.
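As an illustration of this white box view, the following sketch configures a transfer learning model on top of MobileNetV2 with Keras; the number of output classes, the dropout rate, and the learning rate are arbitrary hyperparameter choices made only for demonstration:

```python
# Illustrative sketch of the white box view: choosing a model and its hyperparameters,
# here transfer learning on top of MobileNetV2
# (assumes TensorFlow/Keras; class count, dropout, and learning rate are hypothetical).
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                                    # reuse pretrained features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),                          # hyperparameter: regularization
    tf.keras.layers.Dense(5, activation="softmax"),        # hyperparameter: output classes
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # hyperparameter: learning rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=10)
```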
Conclusions
While the math behind AI models may appear challenging, the concepts and their usage are pretty straightforward. Their design and configuration are an important responsibility that experts in data analytics and AI should take care of. MLOps helps separate the different roles and responsibilities, which is why I consider its use an important booster of development efficiency.
Architecting an appropriate model is far from simple, but it resembles the process of software design. Training an AI model for ML (Machine Learning) may take weeks or months. As it is a time-consuming process, the availability of performant servers is indispensable. Specialized hardware such as Nvidia GPUs or dedicated NPUs/TPUs helps reduce the training time significantly. In contrast to the effort required for training, optimized inference engines (e.g., TensorFlow Lite or TensorFlow Lite Micro) often run well and efficiently on resource-constrained embedded systems, which is the concept behind AIoT (AI plus IoT).
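To give an impression of that last step, here is a minimal sketch of converting a trained Keras model into a TensorFlow Lite model for deployment on an embedded device; the tiny stand-in model and the output file name are purely illustrative:

```python
# Illustrative sketch: converting a trained Keras model for an optimized inference
# engine such as TensorFlow Lite (assumes TensorFlow is installed; the stand-in
# model below would in practice be the trained model from the design step).
import tensorflow as tf

# Tiny stand-in model, only so the sketch is self-contained.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]    # e.g., post-training quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)                               # deploy this file to the edge device
```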