Velocity 2010: Instrumentation & Metrics

Whether Instrumentation & Metrics was the focus of a talk or just part of what the speaker covered, two major rules of thumb kept presenting themselves:

“Instrument everything.” Collect as much data about as many things as you can. If it reports, collect it and store it. If it doesn’t report, make it report. And store it. Make sure you keep around as much historical data as you feasibly can. “But Cliff,” I hear you asking, “won’t you just end up with a whole pile of data that doesn’t really mean anything and just takes up disk space?” Well, read on, because that brings me to the second rule of thumb:

“Data ain’t information!” (Direct quote from a talk on modeling and metrics). So…what does it mean? Well, a couple of things. One of the speakers who gave the presentation linked above would have you believe that data + a model is information. Modeling is critical in that it may allow you to extrapolate information from data points that would otherwise be meaningless. Note that I said may; as this presenter noted, “Data is from the devil, models are from God,” in a nod to the fact that real-world data rarely adheres to the nice, uniform curve generated by the model.
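To make the "data + a model" idea a bit more concrete, here is a minimal sketch of my own (not something shown in the talk). It fits a straight line to a week of made-up disk-usage samples and extrapolates when the disk would fill up. The sample numbers, the 200 GB capacity, and the function names `fit_line` and `days_until_full` are all assumptions for illustration, and a straight line is only one possible model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_full(slope, intercept, capacity):
    """Day (on the x axis) at which the fitted line reaches capacity."""
    if slope <= 0:
        return None  # usage is flat or shrinking; the model never hits capacity
    return (capacity - intercept) / slope


if __name__ == "__main__":
    # Hypothetical samples: day number and disk used in GB.
    days = [0, 1, 2, 3, 4, 5, 6]
    disk_used_gb = [100.0, 103.5, 104.0, 108.5, 110.0, 114.5, 116.0]

    slope, intercept = fit_line(days, disk_used_gb)
    print("growth per day (GB):", round(slope, 2))
    print("projected full on day:", days_until_full(slope, intercept, 200.0))
```

The raw samples on their own are just numbers; the fitted line is what turns them into a statement like "we have roughly this many days left." The samples also wobble around the line rather than sitting on it, which is the "data is from the devil" part.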

The other piece to this is an emphasis on the importance of visualization – i.e., understanding key metrics and how to display them such that interesting/important trends are elucidated. Some examples of this were given in the Ops Meta-Metrics talk, in which John Allspaw demonstrated that code installs and service downtime don’t always have a 1-to-1 correlation…but you will never know that if you don’t track both of those metrics and understand how important it is to compare them over time.
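As a rough illustration of why you need both metrics side by side, here is a small sketch of my own (not taken from the Ops Meta-Metrics talk). It lines up per-day deploy counts with per-day downtime minutes and counts how many days fall into each combination. The dates, the numbers, and the `cross_tabulate` name are hypothetical.

```python
from collections import Counter


def cross_tabulate(deploys_per_day, downtime_minutes_per_day):
    """Count days by (had a deploy, had downtime).

    Both arguments map a day label to a number; a day missing from
    either mapping is treated as zero for that metric.
    """
    table = Counter()
    all_days = set(deploys_per_day) | set(downtime_minutes_per_day)
    for day in sorted(all_days):
        deployed = deploys_per_day.get(day, 0) > 0
        was_down = downtime_minutes_per_day.get(day, 0) > 0
        table[(deployed, was_down)] += 1
    return table


if __name__ == "__main__":
    # Hypothetical data for one week.
    deploys = {"06-21": 3, "06-22": 0, "06-23": 5, "06-24": 1, "06-25": 0}
    downtime = {"06-22": 12, "06-23": 4}

    table = cross_tabulate(deploys, downtime)
    for (deployed, was_down), count in sorted(table.items()):
        print("deploy:", deployed, "downtime:", was_down, "days:", count)
```

If deploys and downtime really moved together one-to-one, every day would land in either the deploy-and-downtime bucket or the neither bucket. Days in the other two buckets are the ones you would never notice without tracking both metrics.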

As a side note on metric monitoring, one of the really cool tools people were talking about at the conference was cucumber-nagios, a tool that lets you write Nagios checks in Cucumber's natural-language syntax. Slick!


