Hitchhiker's Guide to AI, Software Architecture, and Everything Else: Small is better

Monday, January 03, 2011

Small is better - really?

In panels and discussions I often hear that programming languages such as Scala are not a good idea because of their "language size". Less syntax makes a language easier to learn, the argument goes, which is why, for example, Clojure is superior to Scala. Often, the proponents of this argument point to languages such as C++, C# or Java as negative examples. Resembles software architecture a lot, doesn't it?

Honestly, I object to this conclusion. Take languages such as Brainf*ck or the Turing machine. Very simple syntax indeed, but would you consider them powerful languages? Definitely not! In my view it is not the size that matters but the noise-to-signal ratio. If you need to express a solution for a given problem, how much effort do you need to find and articulate the solution using the language idioms? The more appropriate and concise idioms are available that support your solution, the better. As in software architecture, some qualities are important when judging languages:

appropriateness to solve a specific class of problems
simplicity
expressiveness
orthogonality of language structures
principle of least surprise
emergence of features
avoiding implicit dependencies and influences (such as temporaries in C++)
availability of powerful language idioms

By the way, these qualities are also applicable to software architectures. This is no surprise, because we are speaking about the architecture of programming languages here.
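To make the noise-to-signal point a bit more tangible, here is a minimal sketch in Python (picked only for illustration; the post does not prescribe a language). Both function names are hypothetical. They compute the same word counts, once with manual bookkeeping and once with the standard library idiom `collections.Counter`.

```python
from collections import Counter


def word_counts_manual(text):
    """Count words with explicit bookkeeping: more noise around the intent."""
    counts = {}
    for word in text.split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def word_counts_idiomatic(text):
    """Same result, expressed with a ready-made idiom."""
    return dict(Counter(text.split()))


sample = "small is better small is simple"
# Both versions agree; only the amount of ceremony differs.
assert word_counts_manual(sample) == word_counts_idiomatic(sample)
```

The language did not get smaller by offering `Counter`, yet the second version carries less noise. That is the sense in which available idioms can matter more than size.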

There are several conclusions we could draw from these qualities:

Even if a language offers most of these qualities at the beginning, it might erode over time if it is evolved in an unsystematic or improper way.
Several problem classes might imply several languages. So, we have to choose between a best-of-breed approach for each problem class, as a polyglot programmer would prefer, or we could try to identify some multi-paradigm language. Note, however, that multi-paradigm languages are the ones most likely to erode, especially when more and more paradigms are added as an afterthought. This is a risk, though, not a predefined fact. For instance, multiple paradigms were cleanly integrated in Scala from day one, while C++ started as a structured language with classes added on top.
The language core might be excellent but it won't be helpful if the support libraries and tools do not reveal the same qualities mentioned above. A fool with a tool is still a fool. So, even for the best languages, APIs of libraries need to integrate tightly.
If you follow the Pragmatic Programmers' advice to learn a new language every year, you will also extend your knowledge of new solution paradigms and idioms. This will drastically improve your skills as a programmer and architect. Mind the gap: it is not sufficient to just learn a language, you also need to practice it for a while.

Unfortunately, using multiple languages at the same time might be a bad idea. I experienced a project where people invented their own language. After a while, only one expert was left who understood the language, which was not that comfortable for the organization. If you plan to use multiple languages, think about the problems at hand. If they can be better solved using multiple languages, go for it, but make sure several people in your team know these languages. If possible, use languages that run on either the DLR/CLR or the JVM, because then they will share the same SDKs and (often) even the same tools. Be aware that you ARE already using several languages such as HTML5, XML, SQL, DSLs and so forth. We have already become polyglot programmers. But that doesn't mean you should strive for large numbers of languages in your projects.
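As a small illustration of how polyglot everyday code already is, the following sketch mixes four languages in a dozen lines: Python as the host, SQL, a regular expression (a tiny DSL), and HTML. The function name, table and column are made up for this example; `sqlite3` and `re` are Python standard library modules.

```python
import re
import sqlite3


def languages_in_play():
    """Hypothetical helper: one short function, four languages."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (title TEXT)")              # SQL
    conn.execute("INSERT INTO posts VALUES (?)",
                 ("Small is better - really?",))                 # SQL
    (title,) = conn.execute("SELECT title FROM posts").fetchone()
    conn.close()
    words = re.findall(r"[A-Za-z]+", title)                      # regex DSL
    return "<h1>" + " ".join(words) + "</h1>"                    # HTML


print(languages_in_play())
```

None of these embedded languages is exotic, which is the point: a team is usually polyglot long before anyone proposes adding another general-purpose language.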

Personally, I have learned a lot of languages in my career: x86-ASM, Pascal, Modula-2, VB, Java, C, C++, Ruby, Lisp, Clojure, Scala, C#, F#, CIP, Haskell, D, Axum, ... All of these languages have their purpose and their strengths, but also their weaknesses and problem areas they cannot address well. Some of them are large and easy to learn, such as Scala, while others are small and difficult to learn, such as Lisp, which is why they invented Scheme :-)

Size does not necessarily matter; the problem you are going to solve does.

2 comments:

Robert said...: Hi,
when discussing the quality of languages, the main problem is most often the original question. You titled your post "Small is better - really", so first of all you would have to define what "better" means at all. I personally think a language is good when an average team can implement software with it at low cost.
I would agree with "Small is better" in the sense that a language should guide you to write clean code. This is a main problem of C, C++ or even PHP. It is so simple to write really ugly code with those languages. In Java it feels much worse to write bad code. What is a powerful language with complex idioms worth if only 10% of your development staff really understand what's going on? Sounds like a high truck factor.
I always try to think of languages as tools for a craftsman. It is important that the tools are of high quality and easy to use, and they have to fit the problem itself. And not to forget: they must fit into a complete toolset. For us the toolset is SDK, class library, IDE and test tools. And as a company I want to use a toolset for which I can find a lot of people who are able to work with it. So this is one of the most important arguments for the mainstream languages.
Sometimes I feel that highly educated software architects and high-performance developers want to make themselves irreplaceable by introducing languages and tools into teams that can only be handled and understood well by themselves. An average bachelor's education at university is nowadays no longer able to teach all the concepts we are using in typical projects. I think this is one of the main reasons why we should try to reduce complexity, and polyglot programming does not really help to make things less complex.; 2:29 PM
Michael said...: You really addressed some valid points.
Whenever we can avoid polyglot programming, we should do it. In terms of languages, my experience shows that all languages that create implicit side effects, such as C and C++, make it difficult to read and understand code. You can't understand a C++ program without knowing about temporaries and many other compiler- and runtime-related issues. This is where flexibility and complexity go hand in hand.
Java and C# are excellent languages. However, I remember that when I started using Java 15 years ago, many colleagues asked me why we should use Java as we already had Smalltalk and C++. Basically, they brought up arguments similar to those of many who don't feel so passionate about functional languages nowadays. I also remember the time when I was using OO and was regarded as an alien, because people claimed OO was so amazingly complex.
The problem is that languages are subject to a lifecycle. They are created, evolve until they reach their limits, and are then made obsolete when new languages appear that advance productivity. My fear is that Java and C# may reach this point in a few years. You simply cannot add features to languages endlessly without turning them into overcomplex beasts. Thus, the Hammer-and-Nail syndrome is also applicable to languages. I know that learning and using a new language requires a lot of investment, such as competence ramp-up, new tools, new APIs, etc. Fortunately, most new languages today fit well into existing tool infrastructures. Take Visual Studio, Eclipse, NetBeans and IntelliJ IDEA as examples. Using Clojure or Scala isn't that big a step anymore in terms of tools or even APIs. The same holds for F# and Visual Studio.
On the other hand, always striving for the most up-to-date languages or technologies does not make sense either. I've seen too many projects that suffered from such technoritis.
Unfortunately, the number of technologies is constantly growing. Not only do we have to cope with new Web frameworks, database approaches, programming languages, GUI paradigms, IDEs and multicore-related concurrency mechanisms, but also with paradigms such as NoSQL, Clouds, SOA, etc. Whenever many technologies are introduced in the same project, we inevitably face the risk of not knowing the consequences of all these technologies, especially when they are combined. Thus, architects who introduce new technologies even though their developers cannot handle this complexity are doomed to fail.
On the other hand, sticking with the same old technologies and being reluctant to learn new ones has already ruined some companies.
Thus, there are many conflicting forces here.
And there is no easy Stairway to Heaven.; 12:20 AM