Hitchhiker's Guide to Software Architecture and Everything Else - by Michael Stal

Saturday, May 26, 2007

Release It!

Many Java and .NET developers work on enterprise and Web development projects. According to official statements by senior management or the marketing departments of software companies, no one ever seems to have faced major problems in such projects, let alone catastrophes. However, if you meet other developers at software engineering conferences around the world, you will hear plenty of war stories in private conversations - maybe after a few bottles of beer or wine. According to Henry Petroski, "learning from failure" is one of the most important issues in software development. We have all made mistakes in the projects we were involved in. One way to deal with them is to forget those errors in the next project and cause the same problems again and again and .... Do you know that déjà vu feeling when, in a concrete project situation, you think "why do they make exactly the same mistakes as in previous projects?" The other, constructive way to address project failures or problems is to learn from them without looking for a scapegoat, because attacking people won't be of any help here. It is the whole organization, including all stakeholders, that should be ready to learn from failure in a positive way.
Recently, I read a book on this issue, written by Michael T. Nygard and published by "The Pragmatic Programmers" - I really love the books from "The Pragmatic Programmers", to be honest. It is called "Release It! Design and Deploy Production-Ready Software" and covers problems related to software production and deployment. The author illustrates potential risks that he derived from real-life projects, as well as countermeasures to cope with these risks. It is exactly the kind of book that is helpful for practitioners like me. The book is divided into four parts: Part I deals with stability problems, Part II with capacity problems, Part III with general issues, and finally Part IV with the operation of software systems.
I really enjoyed reading this book and learned a lot. I am too lazy to go into all the details, and it is really better for you to read the book yourself. I'd like to see more books of this kind. Of course, I understand that no company would be willing to share all its failures with the rest of the world, especially with its competitors. Unfortunately, only a few companies have a philosophy of learning from their own failures. Thus, books that abstract such problems from real projects and show potential pitfalls and their solutions in a more general way could be a treasure chest for all software engineers. Believe me! Documenting patterns is much easier because they represent proven solutions, so no one will complain about documenting them. But I won't give up trying to convince everyone that a culture of learning from failure would help us be more productive and successful. In the end, most technical (r)evolutions are caused by learning from previous solutions that reached their limits. Maybe someone has read such books and can give some recommendations. More information on the excellent book by Michael T. Nygard is available here.

1 Comment:

  • After reading your review of the book I immediately bought it in a store next to my condo. Great content! I experienced several déjà vus, especially while skimming through the stability antipatterns chapter: blocked threads, SLA inversion, users, integration points. Everything came back to my mind. With those problems in mind, and knowing how to prevent or resolve them, I was able to apply the circuit breaker and timeout patterns in my new assignment to fix an existing production problem. A must-read! Thanks for pointing me to the book.

    By Benjamin Muschko, at 3:26 PM
