Change the way it hurts

    It is not the strongest of the species that survive nor the most intelligent, but the ones who are most responsive to change.

    something Charles Darwin didn't say, but everybody thinks he did

First things first. Let's accept this - we're not so clever. That's why we suck at managing change. And don't be afraid to admit it. It's natural. Whoever pretends otherwise is just looking for ways to cushion the awkward truth. You might cover it with fancy frameworks, hide behind endless bureaucracy or blame somebody else. Your choice.

Now, if you compare Clarence Darrow's quote above (yes, he's the real author) with the traditional understanding of change management, you might notice a significant difference. Whereas the quote talks about the ability to adapt to change, foreseen or not, in practice we keep seeing only a desire to improve the likelihood of success.

In the context of software development, there's only so much testing you can do to ensure a successful release. Don't get me wrong - I'm not suggesting you test less, quite the opposite. But the coverage you can achieve is always limited, either by budget constraints or by the sheer difficulty of predicting how complex systems behave. Then you deliver. And it hurts. How do you manage that?

The difference might be in the way you adapt. We've already established there's always going to be some element of pain behind change. Wouldn't it therefore be better to change the way it hurts? Just ask yourself the following three simple questions:

   1. Who suffers from the change? Is it the developer? QA? Operations? Customer support, or the customers themselves? The farther the pain lands from whoever initiated the change, the less likely they are to learn from it. Each step you introduce between the source of the pain and its target adds noise to the information.
   2. How easy is it to write new code (i.e. introduce the change)? Are your new starters (and definitely not your regular developers) afraid to change things because they might break them?
   3. How fast do you see that something went wrong? You need to connect the change itself with the negative reaction, exactly as you would with your dog :) No real learning comes from something that happened weeks or months ago.

For me, that's about changing the internal culture. Continuous deployment calls for responsibility, and you won't get it by removing the experience of value creation from your core business (your developers, for that matter).

Or you can look for yet-another-framework, introduce more bureaucracy or blame somebody else...

* I would be more than happy to credit the original author of the three questions, which were mentioned during Devops Days US 2010, but unfortunately I don't know who it was. Please let me know if you do.

The Case For Continuous Deployment

Goods require shipping. Software should not. Despite the IT industry’s successful move to detach release management from the need to physically transport the product, ‘shipping’ still creeps into software development today. Well, it’s even worse - it got embedded into our thinking. We don’t release, we ship. The two have become synonyms. What’s wrong with that, you ask?

Shipping the product doesn’t by itself add any value for the customer, and it is a direct cost for the software vendor. We can call it waste. As we are taught, the best way to eliminate waste is to cut down the activities that produce it. So, let’s not ship every change separately. Yes, the obvious and safe bet is to bundle as many features as possible into a single release. But the iteration cycle then gets longer and therefore more expensive. If you want to bet the company’s well-being on a single (even though recurring) milestone, you had better invest some serious effort to make it worthwhile. That means testing. The safety net of software testing relies on improving test coverage. Unfortunately, as your product grows, QA has to grow with it. Overhead increases. Just when we assume we have reduced one type of waste, we have actually introduced a new one - waiting. Now we have two types of waste: waiting and shipping.

Wait a second! Didn’t we just reduce the impact of the latter? Yes, but only at the end of the release cycle. In the process described above you still have to ship the product, and every time you do so waiting creeps in. Waiting for the development team to reach the agreed milestone, waiting to test the features, waiting to deploy the product, waiting to educate the user.

If you feel the pain of the process we’ve just described, you’re well on your way to fixing it. Luckily for us there’s already a way out of this mess. The Internet did not just significantly reduce the shipment cost incurred by a software vendor, it transformed the industry altogether. New vendors became service providers. A subtle change, you might say, but it comes with a great simplification. Where a software house ships the product to thousands (or millions) of customers, a service provider ships it to only one - its internal operations team. No more hard-to-predict environments, no more unnecessary delays. The pain has moved to a new level, from internal users to external ones.

Close proximity to the customer has already been accepted as one of the main advantages of the Agile movement. Shorter iteration cycles enabled faster feedback, and automated testing reduced the waiting time between development and testing. The product creation phase started to move faster. Stories got shorter, iterations more aggressive. In such an environment you have to pull down every obstacle to make the flow as continuous as possible and eliminate waste.

The traditional paradigms of software development now hold one last wall to conquer - deployment itself. In a situation where you have only one customer (you) there’s no more room for a separation of concerns. If there’s no cost to operations talking to development, do you really need to keep them apart? Most likely not. If you have already automated most of your product life-cycle, don’t you want to conquer the last mile and automate the deployment as well? Yes! Move from continuous integration to continuous deployment.

I accept it might sound scary at first, because you are effectively going to remove the last safety net, but you are also going to remove the separation between ‘us and them’. Developers will need to start thinking about every single line of code in the context of deployment, to start thinking like the Operations team. And it doesn’t stop there: Operations will need to start behaving like developers. Both sides have unique processes and experience, learned over time, that can help the other camp.

And no, you can’t just wake up one morning and switch to continuous deployment. You have to work towards it and reduce waste over time. Implementing a smooth product flow will naturally expose bottlenecks and quality problems, and therefore lead to a reduction of waste.

Let’s create software, not new ways to ship it somewhere.

For non-believers and pessimists: the concept of continuous deployment isn’t really something new that came out of the blue. It’s just a natural extension of something called continuous flow, as applied in manufacturing. If you are interested, you can read more about it online - look for Lean manufacturing.

Sustaining a 14-day Release Cycle

We talk about it all the time - agility. At GoodData it’s in our DNA. We understand that being agile is not necessarily easy. You need the right people, attitude and tools. And just as we deliver the GoodData platform to help our customers bring agility into the BI arena, we need the very same tooling internally to sustain the pace of a 14-day release cycle. We bring our customers new features every two weeks - unheard of in the stodgy world of BI and analytics.

Let’s get some perspective, since 14 days might not sound so difficult to agile developers. How do you stay confident that all 2,713 data marts, 1,344 dashboards and 19,086 reports (our internal count as of June 2010) are valid from release to release? For a long time, we performed very expensive and time-consuming validation directly with our customers. But how many reports can you actually test? Where does the attention to every single detail end and become a boring chore?

It turns out it’s at the second report you have to validate. Then you have to add security and privacy concerns. We encourage our customers to be as self-service as possible, so why should we destroy that experience with a need for manual verification? GoodData is about the cloud. And the cloud needs automation to scale.

A few weeks back, we quietly and successfully deployed our Regression Testing Toolkit. Its purpose is simple: take two different code branches, select the data mart(s) you want to compare, and bang - results. All anonymized, with no data being exposed, and still providing enough information for our engineers to fix any problems. To me and you, the results might look a bit sci-fi, but engineers love them, which is critical if they need to resolve problems as fast as possible.

Here’s what it looks like:

[Screenshot: Regression Testing Toolkit results]

For the rest of us, the final OK message is pretty much what we are looking for.

[Screenshot: final OK message]

Then there are the unexpected benefits.

As a devops supporter, I’m glad we found a way to get our engineers more engaged in our production environment. We also got a new performance tracking tool as a by-product. For every single report, and as an aggregate for the whole data mart, we now have a runtime value that is easy to compare across releases or different environments. That’s what I call getting good business value from our engineering team!
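
To make the idea concrete, here is a minimal sketch in Python. It is not the actual toolkit: the ReportRun structure, the function names and the sample data are all made up for illustration. It assumes each report has been run once per code branch, compares hashed results so no customer data is exposed, and reports the runtime difference per report and for the whole data mart.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportRun:
    """One report executed on one code branch (hypothetical structure)."""
    report_id: str
    rows: tuple        # result rows, e.g. (("EMEA", 120), ("US", 300))
    runtime_s: float   # how long the report took, in seconds


def fingerprint(rows) -> str:
    """Hash the result rows so runs can be compared without exposing data.

    Row order matters here; a real tool would need a stable ordering.
    """
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def compare_runs(baseline, candidate):
    """Return (report_id, status, runtime_delta_s) for each candidate report."""
    base_by_id = {run.report_id: run for run in baseline}
    results = []
    for run in candidate:
        base = base_by_id.get(run.report_id)
        if base is None:
            results.append((run.report_id, "MISSING_IN_BASELINE", None))
            continue
        same = fingerprint(base.rows) == fingerprint(run.rows)
        status = "OK" if same else "MISMATCH"
        results.append((run.report_id, status, run.runtime_s - base.runtime_s))
    return results


def total_runtime(runs) -> float:
    """Aggregate runtime for the whole data mart."""
    return sum(run.runtime_s for run in runs)


# Made-up sample: the same data mart run on two branches.
baseline = [
    ReportRun("revenue", (("EMEA", 120), ("US", 300)), 1.0),
    ReportRun("churn", (("Q1", 0.05),), 2.0),
]
candidate = [
    ReportRun("revenue", (("EMEA", 120), ("US", 300)), 1.5),
    ReportRun("churn", (("Q1", 0.07),), 1.0),
]

for report_id, status, delta in compare_runs(baseline, candidate):
    print(report_id, status, delta)
print("data mart runtime delta:", total_runtime(candidate) - total_runtime(baseline))
```

In this sample the "churn" report would be flagged as a mismatch while "revenue" passes, and the runtime deltas show which reports got slower or faster between the two branches.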

No matter how passionate we are about this tool, there’s more work to do. Hopefully you will soon be able to see the validation of your data and model directly. No delays, so you can iterate through changes in your data model faster.

Building confidence and trust in an on-demand solution is a long run. Tools like our Regression Testing Toolkit help us go that extra mile, and help me, an ops guy, sleep better…