Velocity 2010: Continuous Deployment
Posted: July 7, 2010
One of the things that I found really interesting at Velocity 2010 was the prevalence of the use of continuous deployment. I know I’ve mentioned the Facebook operations talk previously, but it’s worth mentioning again as a good example of this. In it, Tom Cook – a Facebook engineer (sorry, couldn’t find a link for Tom) – talks about deploying code at least daily, with feature releases once a week. This flies in the face of the “deploy every 2-3 months” model that I’m familiar with. It also requires significantly more developer involvement, with the developer doing the actual deploy and sticking around to support it rather than throwing it over the wall to ops to put in place once the QA cycle is complete.
So, how is this accomplished? Well, without getting into the technical details of the tools they use (watch the video! really!), it essentially demonstrates a completely different culture than a “quarterly installs” sort of model. Obviously, this sort of thing can’t work in a “get every level of management to authorize the install in triplicate” shop. It requires a DevOps-y sort of environment where there is a tight integration between the folks who know the code and the folks who understand the systems it’s running on. It requires what I heard referred to at the conference as a strong “immune system” – basically, a set of tools (change management, anyone?) and a communication structure that affords a high degree of confidence that a particular install is (a) unlikely to break anything, and (b) can be rolled back quickly with minimal impact if it goes haywire.
I was a bit skeptical of this sort of thing at first, but John Allspaw said something in his Ops Meta-Metrics session that really resonated with me. He said (paraphrasing): “As an ‘ops guy’, I prefer smaller changes more often to big changes less often. Taken to its extreme, consider this: what if the change is only 5 lines of code? Does that feel safer? …because it should.” A light turned on inside my head when I heard that. It’s not about deploying fast “because we can”; it’s about deploying fast because it’s the safest thing to do.
Another interesting thing about this is the sorts of deployment models that can be used to mitigate impact if a 5-line code change does happen to break something. One of the most prevalent: not deploying code changes everywhere all at once. Why not deploy it on a handful of servers – or on every server, but with the feature/change/bug-fix only “turned on” for a handful of users? In essence, why not use a relatively small portion of your userbase as unwitting beta testers for your change? Paul Hammond gave some interesting examples of how to handle this sort of deployment inside the code itself in his Always Ship Trunk session.
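To make the “turned on for a handful of users” idea concrete, here’s a minimal sketch of a percentage-based feature flag. This is just one way to do it – the names (`FLAGS`, `is_enabled`) and the hashing scheme are my own illustration, not code from any of the talks mentioned above:

```python
import hashlib

# Illustrative flag table: feature name -> percent of users who see it.
# Ramping a flag from 5 to 100 rolls the change out gradually; setting
# it to 0 is the quick "rollback" without redeploying any code.
FLAGS = {
    "new_photo_uploader": 5,    # beta: ~5% of users
    "search_rewrite": 100,      # fully launched
}

def is_enabled(flag: str, user_id: int) -> bool:
    """Return True if `flag` is on for this user.

    Hashing the (flag, user_id) pair puts each user in a stable
    bucket from 0-99, so the same users stay in the beta group
    across requests rather than flickering between old and new.
    """
    percent = FLAGS.get(flag, 0)
    digest = hashlib.md5(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# In the request path, the new code is guarded, not branched in VCS:
#
#   if is_enabled("new_photo_uploader", user.id):
#       show_new_uploader()
#   else:
#       show_old_uploader()
```

The point is that the deploy and the launch become separate events: every server runs the same trunk build, and the flag table decides who actually exercises the new path.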