Monday, 8 August 2011

The operational mentality in software development

A talk by Theo Schlossnagle spurred this line of thought. The talk is described as being "about the evolution of a career in web operations", but he spends more of it on why developers need to think operationally. In other words, his take on what DevOps means differs from the prevalent one. DevOps is usually described as increased collaboration between development and operations, with knowledge sharing between the two groups leading to better delivery.

Theo Schlossnagle's position is that developers need to take the operational view when writing code. It's a valid point, and one that I suspect most of us overlook.

I'm going to expand on what he said and share my ideas on that.

Let's take the software you're writing now (assuming it's web software). Is it operable?
It probably does at least these things:

  • Fulfills your requirements document
  • Passes your unit tests and integration tests
  • Has a UI that is usable and responsive

But is it operable? Once it's deployed, can it survive unprecedented load? Fringe cases? Subsystems going down? Third party services it depends on becoming unavailable?

And can it degrade gracefully while all of this is happening?


Selective Failure
Most of the time, we stress systems before deployment, using load tests to simulate real world conditions. That takes care of one aspect. But most of us don't think about the failure of individual subsystems, especially when the system is distributed and its components interact in complex ways, which is true of most big web applications. Handling selective subsystem failures is not purely an operations responsibility. The application has to be written keeping selective failure in mind.
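
To make that concrete, here is a minimal sketch in Python of one way an application might be written with selective failure in mind. Everything in it is hypothetical and mine, not from the talk: the names call_with_fallback, fetch_recommendations and render_product_page are made up, and the "recommendation service" stands in for any non-essential subsystem. The idea is only that the page still gets served, in a degraded form, when one of its dependencies is down.

```python
import logging
from typing import Callable, Dict, List, Tuple

log = logging.getLogger("degradation")


def call_with_fallback(primary: Callable[[], List[str]],
                       fallback: List[str],
                       name: str) -> Tuple[List[str], bool]:
    """Run primary(); if it raises, log the failure and return the fallback.

    The second element of the result is True when the primary call succeeded.
    """
    try:
        return primary(), True
    except Exception as exc:
        log.warning("subsystem %r failed (%s); serving degraded response", name, exc)
        return fallback, False


def fetch_recommendations() -> List[str]:
    # Hypothetical subsystem call; here it always fails to simulate an outage.
    raise ConnectionError("recommendation service unreachable")


def render_product_page(product: str) -> Dict[str, object]:
    # The core of the page does not depend on recommendations, so the page
    # is still served when that one subsystem is down.
    recs, ok = call_with_fallback(fetch_recommendations, [], "recommendations")
    return {"product": product, "recommendations": recs, "degraded": not ok}


page = render_product_page("kettle")
```

The "degraded" flag is there so that the failure stays visible to whoever operates the system instead of being silently swallowed. A real system would probably also want a timeout on the primary call, since a slow dependency can hurt as much as a dead one.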

In the video, Theo brings up an analogy with security. Security is not a feature, but a way of thinking.
Sanitizing user input before putting it in a database query is not a feature.
Not allowing access to your internal web services is not a feature.
These are security restrictions you automatically think of when you develop.
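
As a small illustration of the first of those habits, here is a sketch using Python's built-in sqlite3 module. Note that it uses parameter binding rather than hand-written sanitizing, which is one common way to keep user input out of the query text. The users table and the find_user function are assumptions made up for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The "?" placeholder makes the driver bind the value separately,
    # so user input is never spliced into the SQL text itself.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Hypothetical in-memory table, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

normal_rows = find_user(conn, "alice")
injected_rows = find_user(conn, "alice' OR '1'='1")
```

With binding, the injection attempt is treated as an odd-looking name that matches nothing, instead of as part of the query.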


In the same way, operational thinking should not be something postponed until deployment. It should be de rigueur throughout the design and development process, while the code is being written.