Saturday, September 28, 2024

Hitchhiker's Guide to Vogon Culture

The Hitchhiker's Guide to the Galaxy - Vogons: A Comprehensive Overview

Introduction

The Hitchhiker's Guide to the Galaxy famously notes that if there is one species you should never, under any circumstances, attempt to reason with, it’s the Vogons. Their love for bureaucracy, their complete lack of empathy, and their talent for creating the third worst poetry in the universe have made them a cosmic byword for misery and paperwork. The Guide goes on to describe the Vogons as "slug-like creatures" whose devotion to form-filling, stamping, and the regulation of intergalactic travel is rivaled only by their enthusiasm for torturing other species through mind-numbing bureaucracy.

This chapter offers a comprehensive guide to all things Vogon, including their language, legal systems, court scenarios, reproduction, and, of course, their utterly joyless social life. Read on at your own risk.

Chapter 1: The Vogon Language

Vogon language, much like the creatures themselves, is a harsh, guttural, and needlessly complex form of communication designed to cause maximum discomfort to the listener. The Vogon language is characterized by awkward consonant clusters, over-enunciated vowels, and a grammatical structure so convoluted that entire civilizations have been known to collapse after merely attempting to translate one Vogon court document.

Phonetics and Phonology

Vogon speech is designed to be as unappealing as possible, with a reliance on guttural consonants and nasal vowels. Words typically follow a C-C-V-C pattern (Consonant-Consonant-Vowel-Consonant), with consonants clustering at the beginning and end of words.

Example: Splornk (spaceship).

Stress patterns are equally jarring, with stress usually placed on the first syllable, though formal Vogon speech tends to randomly stress syllables to complicate things further. This makes the language sound as though the speaker is both complaining and issuing a bureaucratic directive at the same time.

Grammar

The Vogon language utilizes a highly convoluted grammar system that ensures even the simplest statement is buried under layers of nested clauses, bureaucratic redundancies, and arbitrary pronoun shifts.

  • Sentence Structure: VSO (Verb-Subject-Object) is the default word order. However, to confuse others (and themselves), Vogons often switch to SOV or even scrambled orders during formal exchanges.
  • Example: Nzzif kaak splornk. Translation: "I destroy the spaceship."
  • Verb Conjugation: Vogon verbs are conjugated for tense, mood, voice, and emotional state. In particular, verbs take on additional suffixes depending on how annoyed the speaker is, a feature that is used frequently.
  • Annoyance Level Example: Nzzifrk (I slightly destroy) → Nzzifgrrrrk (I completely destroy, out of utter frustration).
  • Pronouns: Vogon pronouns are unnecessarily complex, changing based on mood, hierarchy, and disdain for the listener.
  • Examples: Kaak (I, formal), Zagg (you, informal but polite), Thnggrz (they, bureaucratic third-party).

Advanced Vogon Poetry

Vogon poetry is considered the third worst in the universe, and for good reason. Full of meaningless metaphors, awkward sounds, and forced rhyme schemes, Vogon poetry is a tool of torture. Here’s an example:

"Fnorg tzzzl brrzl splornk,
Grzzrk Urrghk tssnif plrnkk.
Fnarrt zzgrk blorrrkfl tzrk,
Thrurg splurtz zzrrrk thrrknk."

Translation: "Life and paperwork is meaningless,
The love of form is all but endless.
Rot and decay fill my soul,
The smell of green is a terrible toll."

Chapter 2: Vogon Bureaucracy

Vogons have elevated bureaucracy to an art form. If you ever find yourself dealing with a Vogon, it is essential to understand that their minds operate on a single principle: the more forms, the better. There is nothing the Vogons enjoy more than filling out forms in triplicate and forcing others to do the same.

Legal Texts

Vogon legal documents are designed to confuse, confound, and trap the reader in an endless cycle of paperwork. Filled with multiple nested clauses, contradictory requirements, and circular references, a single Vogon legal document can take years to decode.

Example of a Legal Document:

Title: Form 78-L: Authorization for Minor Interplanetary Travel (Revision 56-J)

Section 1: General Requirements
"Splurnkfnrgrr splurtzn thzzrkfnr form 92-B, to be submitted in quadruplicate along with supporting documentation, referencing Appendix 12-F and Clause 43-E."

Translation: "The applicant must submit form 92-B, along with multiple supporting forms, unless an additional form overrides the request, in which case further clarification will be required."

Court Scenarios

Vogon courtrooms are places of immense tedium, where legal cases are decided based not on evidence or logic but on how many forms the litigants have filed. The judge, typically a Supreme Bureaucrat, evaluates not the arguments, but the accuracy of the paperwork.

Example Dialogue in a Vogon Courtroom:

Judge (Supreme Bureaucrat Fnarrpftjrl):
"Tssrkfnll splornkfnrgr splurtzzkr tssrkzn form 92-C, in compliance with subsections 43-F and 99-Q, I declare that the claimant has failed to submit the proper forms for cross-examination. Case dismissed!"

Claimant (Vogon A):
"Fnzzgrptzn kaak splurtfnrll form 42-B, thzzrkzn brrzlgn form fnzzkrk!"
Translation: "I have already filed form 42-B, with supporting documentation."

Chapter 3: Vogon Reproduction

One might hope that a species so dispassionate and bureaucratic would have an equally sterile approach to reproduction, and they would be right. For Vogons, reproduction is not an act of love or intimacy, but a tedious and necessary bureaucratic process, regulated by government forms and monitored by officials.

The Reproductive Process

  1. Filing for Reproduction Approval: Vogons must submit Form 54-R: Reproduction Request to the Bureau of Species Continuation. This form must be submitted in quadruplicate, along with supporting documentation such as the Individual Fitness Report and the Reproduction Quota Compliance Form.
  2. Official Selection of Mates: Mating is not a personal choice. The Bureau of Genetic Appropriateness selects mates based on genetic fitness and bureaucratic rank. Once paired, the couple receives an Official Reproduction License (Form 78-B).
  3. Clinical Reproduction: Vogon reproduction likely takes place in designated reproduction centers, where the process is monitored to ensure all steps are followed correctly. Physical contact is minimal, and reproduction may even be artificial, with in vitro fertilization preferred for its efficiency.
  4. Post-Reproduction Paperwork: After reproduction, the couple is required to submit Form 43-R: Reproduction Completion, outlining the details of the procedure. This is followed by Growth and Development Reports and Education Authorization Forms for the offspring.

Chapter 4: Vogon Social Life

If you're picturing a Vogon social gathering as a lively event filled with conversation, laughter, and joy, then you clearly have no understanding of Vogons. Social interactions among Vogons are cold, formal, and revolve around their one true passion: bureaucracy.

Social Hierarchy

In Vogon society, status is determined by one's position in the bureaucratic system. The higher a Vogon's rank, the more respect they command. The most powerful Vogons are the Supreme Bureaucrats, while the lowest-ranking are Junior Clerks tasked with filing endless paperwork.

Social Events

Vogons do not host parties, as those would imply enjoyment. Instead, they participate in form-signing ceremonies and public readings of bureaucratic documents.

  • Form-Signing Ceremonies: When a major form (such as an interstellar travel permit) is signed, Vogons hold a formal gala where the form is signed in triplicate, reviewed, stamped, and filed.
  • Public Readings: High-ranking bureaucrats read aloud especially complex forms, much to the admiration of lower-ranking Vogons, who attempt to decipher the labyrinthine clauses.

Romantic Life

In typical Vogon fashion, romantic interactions are devoid of affection. Romantic conversations, if they can be called that, often involve sarcastic compliments about bureaucratic efficiency.

Example of Romantic Dialogue:

Vogon A (Krzzrk A):
"Grzzxlzzx urrghkrk fnzzgrpt kaak splurtfnll zzfnrrkn."
Translation: "Your bureaucratic skills are unmatched, and your form-filing technique is impeccable."

Chapter 5: Vogon Punishments

For Vogons, the most effective punishments are bureaucratic. Rather than physical penalties, guilty parties are subjected to form overloads or bureaucratic imprisonment.

Form Overload

The guilty party must submit hundreds of forms in a limited time, each with its own supporting documentation. Any errors result in the entire process restarting.

Bureaucratic Imprisonment

The most severe punishment. The convicted Vogon is confined to a room filled with forms to complete, with new forms arriving faster than they can be filed.

Conclusion

The Vogon way of life is a model of how not to run a civilization. Their language, legal system, reproduction, and social interactions are all dominated by a fanatical adherence to rules and regulations. Vogons have perfected the art of making life as joyless and cumbersome as possible, and they take great pride in that accomplishment.

As the Guide wisely advises, if you ever encounter a Vogon, there’s really only one thing you can do: run. If they don’t get you with their bureaucracy, they’ll surely get you with their poetry.

Wednesday, September 18, 2024

Technical Debt Records (TDRs) and the Tool to Create Them

Introduction 

In today's fast-paced software development landscape, teams face the challenge of continuously delivering new features while maintaining code quality. This often leads to compromises known as **Technical Debt**. To systematically document and manage this debt, **Technical Debt Records (TDRs)** have emerged as a vital tool. This article explores the significance of TDRs, how they benefit developers, architects, and testers, and introduces a tool that simplifies the creation of TDRs.

What are Technical Debt Records (TDRs)?

A **Technical Debt Record (TDR)** is a structured document that captures details about technical debt within a software project. Technical debt arises when short-term solutions are chosen for immediate gains, leading to increased maintenance costs, reduced performance, or other long-term disadvantages. TDRs provide a clear overview of existing technical debt, its impacts, and the measures needed to address it.

Motivation for TDRs

Unmanaged technical debt can accumulate over time, resulting in significant negative consequences:

- **Code Quality:** Increased maintenance efforts and declining code quality.
- **Scalability:** Challenges in scaling and adapting the software.
- **Performance:** Potential performance degradation due to suboptimal implementations.
- **Risk Management:** Elevated risks of system failures or security vulnerabilities.

By systematically documenting technical debt through TDRs, teams can proactively identify, prioritize, and address these issues before they become unmanageable.

Benefits of TDRs for Developers, Architects, and Testers

For Developers:

- **Transparency:** Clear documentation of existing technical debt enhances understanding of the codebase.
- **Prioritization:** Helps focus on critical areas that require immediate attention.
- **Reusability:** Awareness of known issues prevents duplicate efforts in troubleshooting and fixing problems.

For Architects:

- **Strategic Planning:** Assists in planning refactoring efforts and architectural improvements.
- **Risk Assessment:** Evaluates the impact of technical debt on the overall system architecture.
- **Decision-Making:** Provides data-driven insights for making informed decisions about system evolution.

For Testers:

- **Focused Testing:** Knowledge of problematic areas allows for more targeted and effective testing strategies.
- **Enhanced Test Coverage:** Ensures that areas affected by technical debt receive adequate testing attention.
- **Quality Assurance:** Guarantees that resolved debts contribute to overall software quality improvements.

The TDR Template and Its Fields

A well-structured TDR template is crucial for effective documentation of technical debt. The tool we present generates TDRs with the following fields:

1. **Title:** A concise name for the technical debt.
2. **Author:** The individual who identified or is documenting the debt.
3. **Version:** The version of the project or component where the debt exists.
4. **Date:** The date when the debt was identified or recorded.
5. **State:** The current workflow stage of the technical debt (e.g., Identified, Analyzed, Approved, In Progress, Resolved, Closed, Rejected).
6. **Relations:** Links to other related TDRs to establish connections between different debt items.
7. **Summary:** A brief overview explaining the nature and significance of the technical debt.
8. **Context:** Detailed background information, including why the debt was incurred (e.g., time constraints, outdated technologies).
9. **Impact:**
   - **Technical Impact:** How the debt affects system performance, scalability, maintainability, etc.
   - **Business Impact:** The repercussions on business operations, customer satisfaction, risk levels, etc.
10. **Symptoms:** Observable signs indicating the presence of technical debt (e.g., frequent bugs, slow performance).
11. **Severity:** The criticality level of the debt (Critical, High, Medium, Low).
12. **Potential Risks:** Possible adverse outcomes if the debt remains unaddressed (e.g., security vulnerabilities, increased costs).
13. **Proposed Solution:** Recommended actions or strategies to resolve the debt.
14. **Cost of Delay:** Consequences of postponing the resolution of the debt.
15. **Effort to Resolve:** Estimated resources, time, and effort required to address the debt.
16. **Dependencies:** Other tasks, components, or external factors that the resolution of the debt depends on.
17. **Additional Notes:** Any other relevant information or considerations related to the debt.
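
As an illustration, the field list above maps naturally onto a single record type. The following Go sketch is purely hypothetical (the struct and field names are mine, not the generator's actual types):

```go
package main

import "fmt"

// TechnicalDebtRecord models the TDR fields listed above.
// Illustrative sketch only -- not the generator's actual types.
type TechnicalDebtRecord struct {
	Title            string
	Author           string
	Version          string
	Date             string // YYYY-MM-DD
	State            string // e.g. Identified, Analyzed, ..., Rejected
	Relations        []string
	Summary          string
	Context          string
	TechnicalImpact  string
	BusinessImpact   string
	Symptoms         string
	Severity         string // Critical, High, Medium, Low
	PotentialRisks   string
	ProposedSolution string
	CostOfDelay      string
	EffortToResolve  string
	Dependencies     string
	AdditionalNotes  string
}

func main() {
	tdr := TechnicalDebtRecord{
		Title:    "Outdated Authentication Library",
		Author:   "Jane Doe",
		State:    "Analyzed",
		Severity: "High",
	}
	fmt.Printf("%s (%s, %s)\n", tdr.Title, tdr.State, tdr.Severity)
}
```

Keeping all fields as plain strings keeps the record easy to serialize into any of the supported output formats.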

Rationale for the `State` Field

The `State` field reflects the workflow stages of handling technical debt. It helps track the progress of each debt item and ensures that no debts remain unattended. The defined states are:

- **Identified:** The technical debt has been recognized.
- **Analyzed:** The impact and effort required to address the debt have been assessed.
- **Approved:** The resolution of the technical debt has been approved.
- **In Progress:** Work to resolve the technical debt is underway.
- **Resolved:** The technical debt has been addressed.
- **Closed:** The technical debt record is closed.
- **Rejected:** The resolution of the technical debt has been rejected.
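
A tool consuming TDRs would typically validate the `State` field against this list. A minimal Go sketch (the function and variable names are my own assumptions, not the tool's actual code):

```go
package main

import "fmt"

// validStates lists the workflow states defined above.
var validStates = []string{
	"Identified", "Analyzed", "Approved",
	"In Progress", "Resolved", "Closed", "Rejected",
}

// isValidState reports whether s is one of the defined workflow states.
func isValidState(s string) bool {
	for _, v := range validStates {
		if v == s {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isValidState("Analyzed"))  // true
	fmt.Println(isValidState("Postponed")) // false
}
```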

Adjusting Fields Based on State

When initially identifying a technical debt, some fields may remain empty and be filled out as the debt progresses through different states:

- **Initial Identification (`Identified`):**
  - **Filled:** Title, Author, Version, Date, State, Summary, Context.
  - **Empty:** Impact, Symptoms, Severity, Potential Risks, Proposed Solution, Cost of Delay, Effort to Resolve, Dependencies, Additional Notes.

- **Analysis Phase (`Analyzed`):**
  - **Filled:** All fields from `Identified` plus Impact, Symptoms, Severity, Potential Risks.

- **Approval Phase (`Approved`):**
  - **Filled:** All previous fields plus Proposed Solution, Cost of Delay.

- **Implementation Phase (`In Progress`):**
  - **Filled:** All previous fields plus Effort to Resolve, Dependencies.

- **Completion Phase (`Resolved` & `Closed`):**
  - **Filled:** All fields including Additional Notes.

This phased approach ensures that TDRs remain up-to-date and accurately reflect the current status of each technical debt item.
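
The phased approach above can be read as a cumulative mapping from state to expected fields. A hypothetical Go sketch (the `Closed` and `Rejected` states add no new fields and are omitted from the progression):

```go
package main

import "fmt"

// fieldsByState maps each workflow state to the fields newly filled
// in at that stage; earlier stages' fields carry over cumulatively.
// Illustrative sketch of the phased approach described above.
var fieldsByState = map[string][]string{
	"Identified":  {"Title", "Author", "Version", "Date", "State", "Summary", "Context"},
	"Analyzed":    {"Impact", "Symptoms", "Severity", "Potential Risks"},
	"Approved":    {"Proposed Solution", "Cost of Delay"},
	"In Progress": {"Effort to Resolve", "Dependencies"},
	"Resolved":    {"Additional Notes"},
}

// requiredFields returns all fields expected to be filled once a TDR
// has reached the given state, following the workflow order.
func requiredFields(state string) []string {
	order := []string{"Identified", "Analyzed", "Approved", "In Progress", "Resolved"}
	var fields []string
	for _, s := range order {
		fields = append(fields, fieldsByState[s]...)
		if s == state {
			break
		}
	}
	return fields
}

func main() {
	fmt.Println(len(requiredFields("Identified"))) // 7
	fmt.Println(len(requiredFields("Analyzed")))   // 11
}
```

Such a mapping would let a tool warn when a record claims a state but is missing fields that stage requires.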

The Tool to Create TDRs

Our **TDR Generator** is a Go-based tool that automates the creation of Technical Debt Records in multiple formats. It supports **Markdown**, **Plain ASCII**, **PDF**, and **Excel**, facilitating integration into various development and documentation workflows.

Features of the TDR Generator

- **User-Friendly:** Interactive prompts guide users through filling out TDR fields.
- **Flexible:** Supports multiple output formats to suit different documentation needs.
- **Automatic Validation:** Ensures input completeness and correctness.
- **Version Control Integration:** Easily check TDRs into systems like Git or SVN.

Repository and Installation

The TDR Generator is available on GitHub. You can access the repository [here](https://github.com/yourusername/technical-debt-generator).

Installation Steps:

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/yourusername/technical-debt-generator.git
   cd technical-debt-generator
   ```

2. **Initialize the Go Module:**

   ```bash
   go mod init technical_debt_generator
   ```

3. **Install Dependencies:**

   The program relies on two external libraries:
   
   - `gofpdf` for PDF generation.
   - `excelize` for Excel file creation.

   Install them using:

   ```bash
   go get github.com/phpdave11/gofpdf
   go get github.com/xuri/excelize/v2
   ```

4. **Save the Program:**

   Create a file named `generate-td.go` and paste the complete program code from the repository into it.

Using the TDR Generator

The program can be executed via the command line with various options to customize the output.

Available Options:

- `-format`: Specifies the output format. Supported formats are:
  - `markdown` (default)
  - `ascii`
  - `pdf`
  - `excel`

  **Example:**

  ```bash
  ./generate_td -format pdf
  ```

- `-output`: (Optional) Specifies the output filename. If not provided, a default filename with the appropriate extension is generated based on the selected format.

  **Example:**

  ```bash
  ./generate_td -format markdown -output my_debt_record.md
  ```

- `-empty`: (Optional) If set, the program generates an empty TDR template with placeholders without prompting for input.

  **Example:**

  ```bash
  ./generate_td -format excel -empty
  ```

- `--help` or `-h`: Displays a help message with usage instructions.

  **Example:**

  ```bash
  ./generate_td --help
  ```
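
For illustration, the default-filename rule described for `-output` might look roughly like this in Go. This is a sketch inferred from the documented formats and the `technical_debt_record.md` default, not the tool's actual code:

```go
package main

import "fmt"

// defaultFilename sketches how a default output filename could be
// derived from the selected format when -output is omitted.
// The real tool's naming logic may differ.
func defaultFilename(format string) string {
	extensions := map[string]string{
		"markdown": ".md",
		"ascii":    ".txt",
		"pdf":      ".pdf",
		"excel":    ".xlsx",
	}
	ext, ok := extensions[format]
	if !ok {
		ext = ".md" // fall back to the markdown default
	}
	return "technical_debt_record" + ext
}

func main() {
	fmt.Println(defaultFilename("pdf")) // technical_debt_record.pdf
}
```
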

Interactive Prompts:

When generating a non-empty TDR, the program will interactively prompt you to enter values for each field, including the new `State` field.

**Sample Interaction:**

```bash
./generate_td -format markdown
```

```
Enter the Title of the Technical Debt: Outdated Authentication Library
Enter the Author of the Document: Jane Doe
Enter the Version (e.g., 1.0.0): 1.2.3
Enter the Date (YYYY-MM-DD) [Leave blank for today]: 

Select the State of the Technical Debt:
  1) Identified
  2) Analyzed
  3) Approved
  4) In Progress
  5) Resolved
  6) Closed
  7) Rejected
Enter the number corresponding to the state: 2

Enter related Technical Debt IDs (leave blank to finish):
 - Related TD ID: TD-101
 - Related TD ID: TD-202
 - Related TD ID: 

Enter Summary: The current authentication library is outdated and poses security risks.
Enter Context: Selected early to meet project deadlines, now incompatible with new security standards.
Enter Technical Impact: Incompatibility with the latest framework version.
Enter Business Impact: Increased risk of security breaches affecting customer trust.
Enter Symptoms: Frequent security audit failures and increased bug reports.
Enter Severity (Critical / High / Medium / Low): High
Enter Potential Risks: Data breaches, legal liabilities, and loss of customer trust.
Enter Proposed Solution: Replace the outdated library with a modern, well-supported alternative.
Enter Cost of Delay: Each month of delay increases security vulnerabilities and complicates future upgrades.
Enter Effort to Resolve: Approximately 6 weeks for two developers.
Enter Dependencies: Completion of the ongoing security audit.
Enter Additional Notes: Coordination with the operations team for seamless integration.

Technical Debt record has been saved to 'technical_debt_record.md'.
```

Output Files:

Depending on the selected format, the program generates the TDR in the specified format:

- **Markdown (`.md`):** Structured and readable documentation suitable for version control and collaborative editing.
- **Plain ASCII (`.txt`):** Simple text format for basic documentation needs.
- **PDF (`.pdf`):** Portable Document Format for sharing and printing.
- **Excel (`.xlsx`):** Spreadsheet format for data analysis and integration with other tools.

Best Practices

Version Control Integration

**Technical Debt Records (TDRs)** are valuable documents that should be maintained alongside your codebase. To ensure that TDRs are effectively tracked and managed, consider the following best practices:

1. **Check TDRs into Version Control:**

   - **Git:** Commit TDRs to your Git repository alongside your code. This approach ensures that TDRs are versioned and can be reviewed, branched, and merged similarly to your source code.
     
     **Example:**
     ```bash
     git add technical_debt_record.md
     git commit -m "Add TDR for Outdated Authentication Library"
     git push origin main
     ```

   - **SVN:** Similarly, commit TDRs to your SVN repository to maintain version history and collaboration.

2. **Organize TDRs:**

   - **Directory Structure:** Maintain a dedicated directory (e.g., `/docs/tdrs/`) within your repository to store all TDRs. This organization facilitates easy navigation and management.
   
   - **Naming Conventions:** Use clear and consistent naming conventions for TDR files, such as `TDR-<ID>-<Title>.<extension>`. For example, `TDR-101-Outdated-Auth-Library.md`.

3. **Link TDRs with Issues or ADRs:**

   - **Issue Tracking Integration:** Reference TDRs in your issue tracker (e.g., Jira, GitHub Issues) to provide context and track resolution progress.
   
   - **Architecture Decision Records (ADRs):** Link related ADRs to TDRs to maintain a comprehensive documentation trail of architectural decisions and their technical debt implications.

4. **Regular Review and Updates:**

   - **Periodic Audits:** Schedule regular reviews of TDRs to assess their current state, prioritize resolutions, and update statuses as work progresses.
   
   - **Continuous Improvement:** Encourage team members to document new technical debt promptly and update existing TDRs to reflect any changes.

5. **Access Control:**

   - **Permissions:** Ensure that only authorized team members can create, modify, or delete TDRs to maintain data integrity and accountability.
   
   - **Collaboration:** Use version control features like pull requests or merge requests to facilitate collaborative reviews and approvals of TDRs.
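
The `TDR-<ID>-<Title>.<extension>` naming convention suggested above is easy to generate programmatically. A small hypothetical Go helper (not part of the tool):

```go
package main

import (
	"fmt"
	"strings"
)

// tdrFilename builds a file name following the TDR-<ID>-<Title>.<extension>
// convention suggested above; spaces in the title become hyphens.
func tdrFilename(id int, title, ext string) string {
	slug := strings.ReplaceAll(strings.TrimSpace(title), " ", "-")
	return fmt.Sprintf("TDR-%d-%s.%s", id, slug, ext)
}

func main() {
	fmt.Println(tdrFilename(101, "Outdated Auth Library", "md"))
	// TDR-101-Outdated-Auth-Library.md
}
```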

Conclusion

**Technical Debt Records (TDRs)** are an indispensable tool for managing technical debt in software projects. They provide transparency, facilitate prioritization, and support strategic decisions to enhance code quality and system architecture. The introduced **TDR Generator** simplifies the creation of these essential documents and integrates seamlessly into existing development and version control workflows.

By consistently utilizing TDRs and integrating them into your version control systems like Git or SVN, teams can effectively manage technical debt, ensuring the long-term health and maintainability of their software projects.






Source: https://github.com/ms1963/TechnicalDebtRecords/tree/main

Saturday, July 13, 2024

AI is not about Intelligence

I have been working on AI topics for several years now. As you all know, the current enthusiasm is both mind-blowing and terrifying. Laymen often tell me what they think AI is all about. In most cases, they assume AI algorithms, in particular LLMs, are smart in a human sense.

No, they are not smart like humans; their behavior merely makes us think they are. I can understand why people are surprised whenever LLMs provide elaborate and sophisticated answers. In reality, all of their replies are based on statistics and giant sets of training data.

It is the same for artificial neural networks (ANNs). An ANN is trained with large datasets in a process called supervised learning. The outcome of each inference is a probability function. If you teach a CNN what a cat or a dog looks like, it will find commonalities, i.e. patterns, for each class (such as dog or cat). Given a picture it has not seen before, it merely estimates how closely the subject in the picture resembles a cat, a dog, or anything else. When you feed it a picture of a cat, it will only be able to respond that this could be a cat with a probability of, say, 91.65%.
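
The probability output described above typically comes from a softmax over the network's raw scores (logits). A minimal Go sketch with made-up logits for three classes:

```go
package main

import (
	"fmt"
	"math"
)

// softmax turns raw network outputs (logits) into class probabilities
// that sum to 1 -- the "probability function" a classifier reports.
func softmax(logits []float64) []float64 {
	max := logits[0]
	for _, v := range logits {
		if v > max {
			max = v
		}
	}
	sum := 0.0
	probs := make([]float64, len(logits))
	for i, v := range logits {
		probs[i] = math.Exp(v - max) // subtract max for numerical stability
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

func main() {
	// Hypothetical logits for the classes cat, dog, other.
	probs := softmax([]float64{4.2, 1.7, 0.3})
	fmt.Printf("cat: %.2f%%, dog: %.2f%%, other: %.2f%%\n",
		probs[0]*100, probs[1]*100, probs[2]*100)
}
```

The network never "knows" it sees a cat; it only reports that the cat class received the largest share of probability mass.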

The same holds for the transformer models (encoders, decoders) in LLMs. They are trained on a huge number of documents to create embeddings. These are just vectors that describe the contexts in which a specific fragment is typically used. To create answers, LLM implementations need to understand the meaning of the prompt and eventually create a reply, word by word, where each succeeding word is determined by a probability function. The actual process is much more complex and sophisticated, but the principle remains the same.

What is missing for AI to be called smart in a human sense?

  • Lack of proactivity: AI algorithms only react to input. They are not capable of proactive behavior.
  • No consciousness: They have no consciousness and cannot reflect on themselves. 
  • Lack of free will: This is a consequence of AI lacking proactivity and consciousness. An AI provides answers but makes no decisions.
  • No emotions: AIs can recognize the emotions of humans, for example, by performing a sentiment analysis or by observing gestures. However, they cannot experience their own emotions such as feeling empathy.
  • Learning from failure: An AI is not able to learn from its own errors, and there is no way to interactively teach it about its mistakes so that it can dynamically adapt. Errors or biases can only be eliminated by changing the training data or the algorithms, which at the end of the day results in a new AI.
  • Constraints: An AI is constrained by the input it receives. It is not able to observe its environment outside of its cage.
  • Fear of death: An AI does not care whether it lives or not. This might sound rather philosophical, but it is a valid aspect, given the way intelligent life behaves.

Unfortunately, the Turing test cannot decide whether an AI is intelligent. It can only figure out whether an AI seems to be intelligent.

What do you think? What could an appropriate test look like?


Tuesday, September 05, 2023

The Dark Side of Crowdfunding

This post is not going to cover any software architecture topic. Instead I want to share some impressions and experiences with crowdfunding platforms such as Indiegogo or Kickstarter.


Let me start with a success story: Bambu Lab was completely unknown when the upcoming 3D printer company started its X1/X1C campaign on Kickstarter. It eventually gathered almost 55 million HK$ from 5,575 backers. In the following months, Bambu Lab completed the X1/X1C product line and sent all the perks to the backers. This new CoreXY 3D printer turned out to be a revolutionary, award-winning, and extremely successful product, which was soon followed by other products like the P1P and the P1S. Needless to say, Bambu Lab has been a huge success story with a happy ending for the crowdfunding company, the campaign owner, and the backers.


One of the benefits of crowdfunding can be summarized as follows: crowdfunding platforms connect innovative campaigners with enthusiastic backers. They enable start-ups and well-established companies to get funding for innovative products.


However, not all campaigns work out that well. In some cases, campaigners fail to deliver a product, deliver only a below-average product, run out of money, or turn out to be scams. Year after year, millions of US dollars are lost this way. It is never foreseeable whether a project will succeed, as is the case with joint ventures. Reasons for failure might be the infeasibility of the innovation, budget overruns, huge project delays caused by unfortunate conditions such as Covid-19, underestimation of costs, or sharp price increases for necessary components.


While project failure can never be avoided, scams can. A Chinese campaign owner collected over one million US dollars in an Indiegogo campaign featuring the world's smallest mini PC, but did not deliver any of the promised perks. After a while, there was no communication at all between the backers and the campaign owner; it seemed as if the owner had simply vanished. When backers asked Indiegogo for help, the crowdfunding company did not feel responsible. It merely disabled further contributions and put a "this campaign is currently under investigation" label on the project website, but never provided any results of the so-called investigation, nor a refund to the betrayed backers.


Lesson 1: Crowdfunding companies do not care (too much) about backers. They earn money by providing a platform for the different parties and treat backers as venture capitalists who are supposed to bear all the risks themselves.


Indiegogo and Kickstarter basically act like betting offices for horse races, with almost no transparency about the horse owners (aka campaign owners). Every participant in such a scenario bears high risks, with the betting company being the only exception. Obviously, the rules between customers and the crowdfunding platform are defined in such a way that the bank (aka the betting office) always wins.


Lesson 2: If you are contributing to a crowdfunding campaign, make sure you can live with project failure and with the complete loss of your contribution.


Every backer should be aware of this reality. They may lose their whole contribution or receive an overpriced or even useless perk. Sure, the majority of campaigns do eventually succeed. However, a significant number of campaigns fail. I do not mind honest project failure despite the huge efforts of campaign owners; this is a known and acceptable risk backers should keep in mind when contributing. But I do mind scam campaigns in which the owners simply take the collected contributions and vanish.


Lesson 3: If you urgently need a specific type of product, don't contribute to a crowdfunding campaign; buy it from well-established sources instead.


Lesson 4: Currently, no safety nets for backers exist, nor is there any transparency or accountability with respect to campaign owners. A campaign resembles a game, or a bet on the future, played without sufficient information about the people running it.


Lesson 5: Do not believe the videos and documents provided by campaigners. Consider this information pure marketing and advertising. Never trust any promises, in particular those that seem unrealistic or very challenging to fulfil. Phrases like "the world's first", "the world's fastest", or "the world's smallest" should make backers sceptical.


What could be done to avoid such situations? Or is the crowdfunding platform inherently unable to protect backers?


In fact, there should be a trust relationship between all players in the game (and yes, it is a game). To achieve the right level of trust, a crowdfunding company should offer the following services:

  • Personal identification of all campaign owners with official legal documents such as passports, driver's licenses, and proof of residence. This enables companies like Indiegogo or Kickstarter to keep in touch with campaign owners and track them down. Sure, passports and the like can be faked as well, but that requires a substantial amount of criminal energy.
  • Transparency: If we analyze existing campaigns, lack of transparency is one of the biggest issues. By "lack of transparency" I am referring to the fact that backers often know almost nothing about campaign owners. This is related to the previous aspect. While backers must prove with their credit card payments that they are trustworthy (which is checked by the credit card companies), they only get a tiny amount of information about campaign owners in return. Wait a minute: I am paying my contribution to people who are mostly anonymous (i.e., hiding behind a campaign website)? Unfortunately, the answer is yes. It does not suffice that only the crowdfunding company holds detailed information about the campaign owners.
  • Due diligence measures would require a crowdfunding company to technically check whether a campaign, or rather its project, is feasible. For this purpose, they may hire experts in the respective domain to validate the claims campaign owners make. In addition, they should check the background of campaign owners, be they companies or individuals. If a successful company such as Anker acts as the campaign owner, there is a much higher chance that contributors will receive the offered perks and rewards. If, on the other hand, the campaign originator is unknown, the risk is significantly higher. Accountability should come to one's mind when thinking about campaigns and their originators.
  • Checks and balances: step-wise transfer of contributions instead of full payment at once. This may be a bit difficult to achieve, because certainly some upfront investments are required by campaign owners. Nonetheless, I'd expect more of a bank (crowdfunding platform)/borrower (campaign owner) relationship in this context. At each step (such as prototyping, testing, final product design, manufacturing, delivery) the crowdfunding company should demand proof from the campaign owners of what they have done and achieved so far with the crowdfunding investments. For example, prototyping only requires a smaller amount of money. After coming up with a successful prototype, they may move forward to completing the product. After the product is ready, they move further to manufacturing. At each step they obtain predefined percentages of the funding. In addition, campaign owners are supposed to provide a concrete timeline for all of their activities. If a step is delayed, no further money can be obtained until the step is completed. A kind of traffic light on the project web site could represent the current risk level of a campaign.
  • Shipment: for each project, campaign owners need to prove that they actually shipped the perks and rewards to their backers by presenting respective documents from the delivery service. In my experience, some campaign owners marked the perks as shipped without ever actually sending any items.
  • Insurance: Crowdfunding companies should pay a part of each contribution to an insurance company that covers all risks and pays back a high percentage of the contribution to backers. This is similar to how PayPal works. It would require campaign originators to disclose personal information which can then be rated in terms of credibility, credit history, financial background, and trustworthiness. This puts more burden on the campaign owners and the crowdfunding company, and makes contributions more expensive, but provides a safety net for backers, who are, after all, the ones paying campaign owners and crowdfunding platforms. I assume many backers would be willing to pay a slightly higher contribution if they gain more security in return. Of course, crowdfunding platforms could act as insurers themselves if they are willing to do so.
  • No selling on other channels: In some campaigns the perk developers started selling their products via their web sites before some backers even received their perks. The contract between campaigners and crowdfunding platforms should definitely exclude this possibility. Whenever backers contribute funding to product development via a crowdfunding campaign, they must be the first to receive their perks and rewards. In addition, some of the products sold were significantly cheaper than the claimed MSRP. This looks like betrayal, smells like betrayal and is a betrayal. In such cases I'd expect campaign owners to have to pay penalties to backers.
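The step-wise release of funds with a traffic-light risk indicator could be sketched roughly as follows. This is a hypothetical illustration, not a real platform API: the milestone names, percentage shares, and the one-overdue-milestone threshold for "yellow" are made-up assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Milestone:
    name: str
    share: float          # fraction of total funding released at this step
    due: date
    proven: bool = False  # campaign owner has provided proof of completion

@dataclass
class Campaign:
    total_funding: float
    milestones: list

    def released_amount(self) -> float:
        # Only proven milestones release their tranche of the funding.
        return sum(m.share * self.total_funding
                   for m in self.milestones if m.proven)

    def traffic_light(self, today: date) -> str:
        # Risk indicator based on unproven milestones past their due date.
        overdue = [m for m in self.milestones if not m.proven and m.due < today]
        if not overdue:
            return "green"
        return "yellow" if len(overdue) == 1 else "red"

campaign = Campaign(100_000.0, [
    Milestone("prototyping", 0.2, date(2024, 3, 1), proven=True),
    Milestone("manufacturing", 0.5, date(2024, 6, 1)),
    Milestone("delivery", 0.3, date(2024, 9, 1)),
])
```

With the proven prototyping milestone, only 20% of the funding would have been released; on 1 July 2024 the overdue manufacturing step would switch the indicator to yellow.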

Some may argue that all of these measures restrict the freedom of campaign owners. They are right in this respect. However, there currently is an imbalance between contributors, campaign originators, and crowdfunding platforms which puts most risks on the backers. Thus, it seems more than fair to share these risks among all stakeholders. I honestly believe that crowdfunding will evolve into a dead end if companies like Indiegogo continue to put all burdens on backers, don't care much about scams, refuse to create safety nets, or maintain the current lack of transparency. If they implement all, or at least some, of the aforementioned measures, this will clearly turn out to be more of a win/win/win scenario.

Sunday, August 20, 2023

AI and Software Architecture - Two Sides of the Same Coin

Introduction

Media worldwide are currently jumping on the AI bandwagon. In particular, Large Language Models (LLMs) such as ChatGPT sound appealing and intimidating at the same time. When we dive deeper into the technology behind AI, it doesn't feel that strange at all. Contrary to some assumptions of the tabloid press, we are far away from a strong AI that resembles human intelligence. This means blockbusters such as Terminator or Blade Runner are not going to come true in the near future.

Current AI applications, while very impressive, represent instantiations of weak AI. Take object detection as an example, where a neural network learns to figure out what is depicted in an image. Is it a cat, a dog, a rabbit, a human, or something different? Eventually, neural networks process training data to compute and learn a nonlinear mathematical function that works incredibly well for making good guesses (aka hypotheses) with high precision about new data.
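That "nonlinear mathematical function" view can be made concrete with a tiny feed-forward sketch. The weights below are fixed toy values for illustration, not learned ones, and the three "classes" are arbitrary stand-ins:

```python
import math

def matvec(W, v):
    # Multiply a weight matrix (rows = output units) with an input vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

# Toy fixed weights (illustrative, not trained):
W1 = [[0.2, -0.5, 0.4], [0.1, 0.8, -0.3]]    # 3 inputs  -> 2 hidden units
W2 = [[0.7, -0.6], [-0.2, 0.9], [0.5, 0.1]]  # 2 hidden  -> 3 class scores

def predict(x):
    # The whole "network" is just a composition of functions:
    return softmax(matvec(W2, relu(matvec(W1, x))))

probs = predict([1.0, 0.5, -0.2])  # class probabilities, summing to 1
```

A trained classifier does exactly this at inference time, only with millions of learned weights instead of six hand-written ones.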

On the other hand, this capability proves to be very handy when dealing with big or unstructured data such as images, videos, audio streams, time series data, or Kafka streams. For example, autonomous driving systems strongly depend on this kind of functionality, because they continuously need to analyze, understand and handle highly dynamic traffic contexts, e.g., potential obstacles.

In this article, I am not going to explain the different kinds of AI algorithms, such as types of artificial neural networks and ML (Machine Learning) techniques, which may be the topic of a subsequent article. My goal is to draw the landscape of AI with respect to software architecture & design.


There are obviously two ways of applying AI technologies to software architecture:

  • One way is to let AI algorithms support software architects and designers in their tasks such as requirements engineering, architecture design, implementation or testing - which I’ll call the AI solution domain perspective.
  • The other way is the use of AI to solve specific problems in the problem domain, which is why I'll name it the AI application domain perspective.


AI for the Solution Domain

LLMs are probably the most promising approach when we consider the solution domain. Tools such as GitHub Copilot, Meta Llama 2 and Amazon CodeWhisperer help developers generate functionality in their preferred programming language. It seems like magic but comes with a few downsides. For example, you can never be sure whether an LLM learned its code suggestions from copyrighted sources. Nor do you have any guarantee that the code does the right thing in the right way. Any software engineer who leverages an application like Copilot needs to look over the generated code again and again to ensure the code is exactly what she or he expects. It requires software engineering experts to continuously analyze and check LLM answers. At least currently, it appears rather unlikely that laymen may take over the jobs of professional engineers with the help of LLMs.


Companies have already begun to create their own LLMs to cover problem domains such as industrial automation. Imagine you need to develop programs for a PLC (Programmable Logic Controller). In such environments, the main languages are not C++, Python or Java. Instead you'll have to deal with domain-specific languages such as ST (Structured Text = Siemens SCL) or LD (Ladder Diagram). Since there is much less source code freely available for PLCs, feeding an LLM with appropriate code examples turns out to be challenging. Nonetheless, it is a feasible objective.


AI for the Application Domain

In many cases Artificial Neural Networks (ANNs) are the basic ingredient for solving problem domain challenges. Take logistics as an example, where cameras and ANNs help identify which product is in front of a camera. Other AI algorithms such as SVMs (Support Vector Machines) enable testing equipment to figure out whether a turbine is behaving according to its specification or not, which is commonly called anomaly detection. At Siemens we have used Bayes trees to forecast the possible outcome of system testing. Reinforcement learning happens to be useful for successfully moving and acting in an environment, for example robots learning how to complete a task successfully. Another approach is unsupervised learning, such as k-means clustering, which classifies objects and maps them to different categories.
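As a flavour of the unsupervised case, here is a minimal k-means sketch on made-up one-dimensional data. Real applications would use a library such as scikit-learn on multi-dimensional feature vectors; the data and the two clusters below are purely illustrative:

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # pick k initial centers
    for _ in range(iterations):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.3]      # two obvious groups
centers = kmeans(data, k=2)                  # converges near 1.0 and 10.1
```

The algorithm maps each object to a category (its nearest center) without any labels, which is exactly the point of unsupervised learning.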


Even more examples exist:

Think about security measures in a system that comprise keyword and face recognition. Autonomous driving uses object detection and segmentation in addition to other means. Smart sensors include ANNs for smell and gas detection. AI for preventive maintenance helps analyze whether a machine might fail in the near future based on historical data. With the help of recommender systems, online shops can provide recommendations to customers based on their order history and product catalog searches. As always, this is only the tip of the iceberg.


Software Architecture and AI

An important topic seldom addressed in AI literature is how to integrate AI in a software-intensive system.


MLOps tools support different roles like developers, architects, operators and data analysts. Data analysts start with a data collection activity. They may augment the data, apply feature extraction as well as regularization and normalization measures, and select the right AI model which is supposed to learn how to achieve a specific goal using the data collection. In the subsequent step they test the AI/ML model with sufficient test data, i.e. data the model has not seen before. Eventually, they version the model & data and generate an implementation. Needless to say, data analysts typically iterate through these steps several times. When MLOps tools such as Edge Impulse follow a no-code/low-code approach, separation of concerns between different roles can be easily achieved. While data analysts are responsible for the design of the AI model, software engineers can focus on the integration of the AI model in the system design process, as the MLOps environment generates the implementation of the model.


Software engineers take the implementation and integrate it into the surrounding application context. For example, the model must be fed with new data by the application, which reads and processes the results once inference is completed. For this purpose, an event-driven design often turns out to be appropriate, especially when the inference runs on a remote embedded system. If the inference results are critical, resilience might be increased by replicating the same inference engine multiple times in the system. Docker containers and Kubernetes are well-suited solutions, in particular when customers desire a scalable and platform-independent architecture with high separation of concerns, as in a microservice architecture. Security measures support privacy, confidentiality, and integrity of input data, inference results and the model itself. From a software engineering viewpoint, inference can in most cases be treated as a black box that expects some input and produces some output.
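The event-driven, black-box integration described above can be sketched with two queues and a worker thread. The inference function is a trivial stand-in; in a real system it would call a remote or embedded inference engine:

```python
import queue
import threading

def inference(x):
    # Stand-in for a real inference engine (black box: input in, result out).
    return x * 2

inputs, results = queue.Queue(), queue.Queue()

def worker():
    # Event loop: consume input events, publish inference results.
    while True:
        x = inputs.get()
        if x is None:          # poison pill shuts the worker down
            break
        results.put(inference(x))

t = threading.Thread(target=worker)
t.start()
for x in [1, 2, 3]:            # the application feeds new data as events
    inputs.put(x)
inputs.put(None)
t.join()
out = [results.get() for _ in range(3)]
```

Replication for resilience would simply mean starting several such workers on the same input queue; the application code feeding the queue stays unchanged.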

When dealing with distributed systems or IoT systems, it may be beneficial to execute inference close to the sources of input data, thus eliminating the need to send around big chunks of data, e.g., sensor data. Even embedded systems like edge or IoT nodes are capable of running inference engines efficiently. In this context, often only the inference results are sent to backend servers.


Operators finally deploy the application components onto the physical hardware. Note: a DevOps culture turns out to be even more valuable in an AI context, because more roles are involved.


Input sources may be distributed across the network, but may also comprise local sensor data of an embedded system. In the former case, either Kafka streams or MQTT messages can be appropriate choices to handle the aggregation of necessary input data on behalf of an inference engine. Take processing of weather data as an example where a central system collects data from various weather stations to forecast the weather in a whole region. In this context we might encounter pipelines of AI inference engines, where the results of different inference engines are fed to a central inference engine. Hence, such scenarios comprise hierarchies of possibly distributed inference engines.
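Such a hierarchy of distributed inference engines can be illustrated with a toy example. Both "inference" functions and the weather readings are made-up stand-ins, not a real forecasting model:

```python
def local_inference(readings):
    # Station-level inference: e.g., an average temperature estimate,
    # computed close to the sensors so raw data never leaves the station.
    return sum(readings) / len(readings)

def central_inference(station_results):
    # Central inference: combines the partial results of all stations
    # into a region-level forecast.
    return {"region_avg": sum(station_results) / len(station_results)}

stations = {"north": [14.0, 15.0], "south": [19.0, 21.0]}
partials = [local_inference(r) for r in stations.values()]
forecast = central_inference(partials)
```

Only the small partial results travel over the network (e.g., as MQTT messages or Kafka events), not the raw sensor streams, which is the point of placing inference close to the data sources.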


Architecting AI models

Neural networks and other types of AI algorithms expose an architecture themselves, be it a MobileNet model leveraged for transfer learning, an SVM (Support Vector Machine) with a Gaussian kernel, or a Bayes decision tree. The choice of an adequate model has significant impact on the results of AI processing. It requires the selection of an appropriate model and hyperparameters such as the learning rate or the configuration of layers in an ANN (Artificial Neural Network). For data analysts, or those software engineers who wear a data analytics hat, a mere black box view is not sufficient. Instead they need a white box view to design and configure the appropriate AI model. This task depends on the experience of data analysts, but may also imply a trial-and-error approach for configuring and fine-tuning the model. The whole design process for AI models closely resembles software architecture design. It consists of engineering the requirements (goals) of the AI constituents, selecting the right model and training data, testing the implemented AI algorithm, and deploying it. Consequently, we may consider these tasks as the design of a software subsystem or component. If an aforementioned MLOps tool is available and used, it can significantly boost design efficiency.
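The trial-and-error search over hyperparameters can be sketched as a simple grid search. The "validation loss" here is a made-up function of the hyperparameters; in practice, each evaluation means training and validating a full model:

```python
def validation_loss(learning_rate, layers):
    # Made-up surrogate: in reality this would be the measured loss of a
    # model trained with these hyperparameters on a validation set.
    return (learning_rate - 0.01) ** 2 + 0.05 * abs(layers - 3)

# Candidate hyperparameter configurations (learning rate x layer count):
grid = [(lr, layers)
        for lr in (0.001, 0.01, 0.1)
        for layers in (2, 3, 4)]

best = min(grid, key=lambda cfg: validation_loss(*cfg))
```

More sophisticated strategies (random search, Bayesian optimization) follow the same pattern: propose a configuration, measure, keep the best.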


Conclusions

While the math behind AI models may appear challenging, the concepts and usage are pretty straightforward. Their design and configuration is an important responsibility that experts in Data Analytics and AI should take care of. MLOps helps separate different roles and responsibilities which is why I consider its use as an important development efficiency booster. 

Architecting an appropriate model is far from simple, but resembles the process of software design. Training an AI model for ML (Machine Learning) may take weeks or months. As it is a time consuming process, the availability of performant servers is indispensable. Specialized hardware such as Nvidia GPUs or other dedicated NPUs/TPUs helps reduce the training time significantly. In contrast to the amount of required training effort, optimised inference engines (e.g., TensorFlow Lite or TensorFlow Lite Micro) often run well and efficiently on resource-constrained embedded systems, which is the concept behind AIoT (AI plus IoT).






Saturday, April 29, 2023

 Systematic Re-use

Re-use is based upon DRY (Don't Repeat Yourself), one of the fundamental principles not only for lazy software engineers. Instead of reinventing the wheel again and again, developers and architects may re-use existing artifacts.

Re-usable assets come in different flavors:

  • Code snippets are small building units developers may integrate in their code base. 
  • Patterns are smart and proven design blueprints that solve recurring problems in specific contexts.
  • Libraries comprise encapsulated functionality developers may bind to their own functionality.
  • Frameworks also comprise encapsulated functionality. In contrast to libraries, developers integrate their own code into the framework according to the Hollywood principle (don't call us, we'll call you).
  • Components/Services include binary functionality (i.e., they are executables) that developers may call from their own application.
  • Containers represent runtime environments that provide functionality and environments to applications in an isolated way.
Apparently, these are different levels of re-usable assets with varying granularities, complexities, and prerequisites.

Software engineers may not only use re-usable software assets, but other types as well. For instance:
  • Tests, Test units, Test plans
  • Documents
  • Production plans
  • Configurations
  • Business plans
  • Software architectures
  • Tools
While some assets such as code snippets may be used daily in the code-base, patterns or software architecture templates need to be instantiated in an easy way. 
The more impact re-usable assets have on applications and the more abstract they are, the more systematic the re-use approach must be. The most challenging projects are product lines and ecosystems that require different assets at different re-use levels. For example, they introduce the need for a configurable core asset base that is re-usable across different applications. Furthermore, they support a whole class of applications that share the same architecture framework and other assets. A core asset in a product line or ecosystem affects not one application but a whole system family.  Thus, its business impact is very high. 
In such scenarios, core assets often are inter-dependent and must be configured for the specific application under development.  As a prerequisite for the development of a core asset base, a Commonality/Variability analysis is necessary that determines what applications sharing the same core assets have in common and how they differ. A core asset needs a common base relevant for all applications that use it as well as configurable variation points to adapt it to the needs of an application. 
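The idea of a core asset with a common base and configurable variation points can be sketched as follows. The ReportGenerator asset and its variation points are hypothetical examples, not an actual product-line framework:

```python
class ReportGenerator:
    """Core asset: report logic common to all applications in the family."""

    def __init__(self, formatter, locale="en"):
        self.formatter = formatter   # variation point: output format
        self.locale = locale         # variation point: language

    def generate(self, data):
        # Common behavior, parameterized by the variation points.
        return self.formatter(data, self.locale)

# Application-specific bindings of the variation points:
def plain_text(data, locale):
    return f"[{locale}] " + ", ".join(str(d) for d in data)

def csv_format(data, locale):
    return ";".join(str(d) for d in data)

app_a = ReportGenerator(plain_text)                 # application A's config
app_b = ReportGenerator(csv_format, locale="de")    # application B's config
```

The Commonality/Variability analysis determines which parts land in the shared class body (commonality) and which become constructor parameters or plug-in functions (variability).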
A bad or insufficient Commonality/Variability analysis incurs higher costs and may even lead to project failure. 
Core asset development and application development might happen separately by different teams or by the same teams. Each approach has its benefits and liabilities.
Due to the high business and technical risks of these advanced approaches, all stakeholders need to be involved in the whole development process. Building a product line or ecosystem without management support is not feasible. Managers need to restructure their organisation and allocate budget for core asset development and evolution. 
Most product lines and ecosystems fail because of:
  • lack of management support,
  • insufficient consideration of customer needs,
  • inappropriate organisation,
  • inadequate Commonality/Variability analysis,
  • insufficient or low-quality core assets,
  • underestimation of testing or inadequate quality assurance,
  • bad software architecture,
  • neglect of competence ramp-up activities,
  • no re-use incentives,
  • missing acceptance by stakeholders.
Consequently, product lines and ecosystems need a systematic approach for re-use and must involve different types of stakeholders. They need a manager who is able to guide the approach and has the capability to decide, for example, on budget, organisation restructuring, competence ramp-up activities, or business strategy. 

Re-use comes in different flavors, and the higher its impact, the more systematic the re-use process needs to be.

[to be continued]




Sunday, March 26, 2023

 

Models and Modelling - A Philosophical Deep Dive

Motivation

Not only in software architecture do we use models for designing and documenting systems. Models are also indispensable in other engineering disciplines and in the natural sciences. We have all experienced good and bad models in our daily lives. What is a model really about? And what does a good model look like? Let us enter a (philosophical) discussion about this topic.


What is a model?

A model captures the essence of a domain. It focuses on the core entities and the relationships within a domain from a specific viewpoint, i.e., serving a specific purpose. A model contains rules that must hold for its constituents. Models are used by humans or machines to communicate about the respective domain for a particular purpose.


Examples of models include:

  • a UML diagram
  • a street map
  • a floor plan
  • an electronic circuit diagram
  • a problem domain model (DDD)
  • quantum theory
  • mathematical formulas


Consequences: 


(i) The same domain can be represented using different models, each capturing another viewpoint of that domain. These viewpoints are often briefly called views.


(ii) Models can be informal or formal depending on their usage as a means for communication. Thus, they must be easily understandable and comprehensible by stakeholders.


(iii) Models introduce abstraction layers by using generalization and specialization, leaving out "unnecessary" or irrelevant details. 


(iv)  A model does not describe reality but a subset of reality viewed from a specific angle.


(v) Languages are based upon models. A model can be viewed as a language, and vice versa. 


(vi) A model may support a graphical presentation or a textual presentation; it may even include both.


The complexity of a model is directly proportional 

  • to the number and types of its entities and their relationships,
  • to the kinds and numbers of abstractions being used,
  • to the complexity of its underlying rules.


A good model:

  • provides a proper separation of concerns (SoC)
  • consequently applies principles such as the Single Responsibility Principle (SRP), Don't Repeat Yourself (DRY), KISS, or the Liskov Substitution Principle (LSP) in order to gain the highest understandability and comprehensibility
  • uses expressive names for all its abstractions, entities, dependencies
  • provides an effective and efficient means of communicating among stakeholders
  • focuses on essence and leaves out everything that does not serve the required purpose of the addressed viewpoint
  • strictly and consistently avoids accidental complexity
  • allows modeling simple things in a simple way, while being capable of expressing complex things in a manageable way


Stakeholders

The creation of a model should be guided by its (types of) stakeholders, in particular by the way they intend to use the model. In this context a meta model helps define what the set of creatable models should look like. Thus, meta models constitute modeling languages. They help create different models or views.


To define an adequate model that serves an intended purpose, all (human) stakeholders should be involved. UML is an example of a modelling language that serves the needs of software engineers but (often) not those of many domain experts. In fact, domain experts might have their own models readily available. While a model might be perfect for machine-machine communication, it isn't necessarily adequate whenever humans are involved. The more formal a model is, the easier it can be processed by computers. Humans often need more informal and expressive models instead. If both kinds of stakeholders are involved, we need to balance formal and informal approaches. 

Emojis are an example of an informal model. They can be immediately understood by a human, but may be more difficult to process by a machine.

Artificial Neural Networks, albeit "simple", can be processed by machines very well, but are hard for a human to understand - i.e., with respect to what they actually do and how they work. 

UML is somewhere in the middle of these extremes. 


Fortunately, in many mature domains models already exist. An electrical circuit diagram defines a proven concept of a model. Mathematics is often considered a ubiquitous language with predefined notations. In the context of software engineering, domain models are often implicitly defined and have been established as common sense in an organization. If software engineers with no or little domain expertise start to develop software applications for the respective domain, they need to make the implicit model explicit. Otherwise they cannot design a software architecture that meets the customer requirements. This is what DDD (Domain-Driven Design) is all about. It tries to come up with a domain-specific model using generic building blocks such as DDD patterns and techniques.


The representation of a model should fit the needs of its stakeholders. For humans, graphical notations often work very well, because they explicitly reveal their structure in an easy manner and are easy to grasp and to handle. For productivity reasons, textual models may be more beneficial and flexible in some cases. As an example, consider software code. For a beginner, graphical code blocks might work very well, while advanced programmers prefer coding textually, because they can mentally map seamlessly between the "graphical" design and the textual code representation. Handling code graphically might just reduce their productivity, effectiveness and flexibility due to all the clutter and constraints.


Model Transformations

To keep many stakeholders satisfied, a possible approach is to introduce different models for different types of stakeholders and to create mappings between these models, for example an easy-to-understand UML model that is transformed into a machine-readable XML schema. 

Actually, software engineers are used to handling different models that are mapped onto each other. In software engineering, a compiler represents a model transformation from a high-level language to a system language or interpreter. A UML diagram might be transformed into high-level language code. A low-code/no-code environment creates domain-specific applications from high-level user specifications. However, model-to-model transformations can be quite complex, in particular when the gap between models is very large and no common off-the-shelf solutions for the transformations are available. Moreover, the more models there are, the more transformations are necessary. Note: a model transformation might also be done manually if the model is not too complex and the mapping rules are pretty straightforward.
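A tiny model-to-model transformation can be sketched as follows: a small, UML-like class model (nested dicts) is turned into an XML-schema-like textual representation. Both the source model and the target syntax are illustrative, not a real UML or XSD toolchain:

```python
# Source model: class names mapped to their attributes and types
# (an informal, human-friendly representation).
model = {
    "Customer": {"name": "string", "age": "int"},
    "Order":    {"total": "float"},
}

def to_xml(model):
    # Transformation: map each model element to a machine-readable target.
    lines = ["<schema>"]
    for cls, attrs in model.items():
        lines.append(f'  <type name="{cls}">')
        for attr, ty in attrs.items():
            lines.append(f'    <field name="{attr}" type="{ty}"/>')
        lines.append("  </type>")
    lines.append("</schema>")
    return "\n".join(lines)

xml = to_xml(model)
```

The mapping rules here are trivial (one class per type element, one attribute per field element); real transformations grow complex exactly when such one-to-one rules no longer suffice.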


Model sets

In domains such as building construction or software engineering, multiple views are necessary to represent information from different angles. Take the design view, deployment view or runtime view as examples in the software engineering domain. In addition, there might be different model abstraction layers, for example an in-depth design view versus a high-level software architecture view. In other words, to solve a task we need a model set instead of a single model that captures every detail from every perspective.

No matter how the views differ from each other, there needs to be meta information to tie the different views together. Prominent examples are the mapping from a view to code, and the implicit or explicit relation of views with each other. Note: there might be different solutions, or rather model kits, for the same problem context; e.g., RUP's 4+1 views in contrast to TOGAF might not be the (only) solution of choice for designing an enterprise system. 

No matter what model set you choose, make sure that it is used consistently. In most cases tool support is strongly recommended. Models can become very complex. Therefore you need a tool to draw, check and communicate the concrete models. This is the main reason why most software engineering activities rely on some sort of UML environment such as Enterprise Architect or MagicDraw. 

A ground plan is different from an electricity plan, but all models together are necessary for building construction. In this example, there might also be rules and constraints across all models, or rather views. For example, an electrical cable should keep a minimum distance from a water pipe. Consequently, we need some kind of verification algorithm to check whether rules/constraints are violated. 
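Such a cross-view verification can be sketched as a small check over elements from two different plans. The coordinates, the half-metre threshold, and the flat 2-D representation are all made-up assumptions for illustration:

```python
import math

MIN_DISTANCE = 0.5  # metres (illustrative threshold)

# Element positions taken from two different views/plans:
cables = [(0.0, 0.0), (2.0, 1.0)]   # electricity plan
pipes  = [(0.3, 0.1), (5.0, 5.0)]   # plumbing plan

def violations(cables, pipes, min_dist=MIN_DISTANCE):
    # Cross-view rule: every cable must keep min_dist from every pipe.
    return [(c, p) for c in cables for p in pipes
            if math.dist(c, p) < min_dist]

bad = violations(cables, pipes)     # pairs that violate the constraint
```

A modelling tool would run such checks automatically whenever either view changes, flagging the offending element pairs to the respective stakeholders.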


Model Creation

Models should never be created in a big-bang approach. They are living entities that change over time as you gain more experience. They may start very simple but become more complex over time. Whenever they are overengineered, they need to be simplified/refactored again. Model creators need to ensure that models can be used and handled by stakeholders easily. If stakeholders have different viewpoints on the same problem, create a model set where each model view serves a particular set of stakeholders.


To start creating a model for a domain context, we should figure out whether such models already exist, and if this is the case, whether these models can serve the desired purpose(s). It is always beneficial to use existing models, in particular due to the experience and knowledge they carry. So, don't reinvent the wheel if it is not absolutely necessary, especially if you are not an expert in the domain.


If no model exists, stakeholders should jointly create a model (set). It is helpful if at least one of the stakeholders is experienced in creating models while at least some other person is a domain expert.


If models exist that do not serve the intended purpose, we might change and adapt these models to fit our needs.


Note: a common mistake is to focus first on the syntax of a model. Instead, initially think about its semantics and find a good syntactical representation afterwards.  


No matter how a new model is created, learning its representation should be a quick and straightforward process, even for inexperienced stakeholders.


Interestingly, most graphical models consist of rectangular or other symmetrical shapes, arrows, lines and text boxes, while textual models often use regular or context-free grammars. The reason for this observation is that such models remain comprehensible and easy to handle. It should also be possible to draw a model manually in order to discuss it with other stakeholders before documenting it. Sitting around a modelling tool significantly decreases productivity, at least in my experience. A whiteboard or a flip chart is by far the best tool for modelling. This can be complemented by AI software that recognizes manually drawn models and transforms them into clean and processable data representations. 


Summary

In this blog posting I did not reveal any new or innovative stuff you didn't already know. Neither was it my intent to provide anything revolutionary. It is just a summary of modelling and how to approach it. And if it got you thinking about modelling from this more philosophical view, I'd be happy.