Snyder’s First Law of Capacity Planning

There’s a phrase I hear all the time that’s been bothering me for a while now. I hear something like it about once a week on average when I ask about requirements for a system that needs to be provisioned. It usually goes something like the following:

“Well, I think we’ll need 20GB, so let’s go ahead and ’round up’ [sic] to 50GB…y’know, just to be safe. Storage is cheap, right? I mean, just yesterday I saw a 1TB drive for like $60!”

…and it drives me nuts.

I’d like to apologize ahead of time for using “you” below if you, personally, have never said this to me. It’s conversational in voice, not accusatory…

First of all, what you’ve done here isn’t “rounding up” in any way, shape, or form. What you’ve done is increase your requirement by 150%.

Second: yes, perhaps storage is cheap relative to other system components – like, say, RAM. However, throw in high-speed SCSI disk, multiple spindles for performance and redundancy, a support contract, supporting subsystems like SAN switches and cabling (also redundant), support staff costs, etc., and it’s significantly more expensive per-GB than that Hitachi SATA HDD you saw in the bargain bin at Office Depot.

Third, if you put two-and-a-half times as much disk into every request as you really need, that means we have to buy two-and-a-half times as much disk. You know what that means as a function of cost? You guessed it! We’re now spending (approximately) two-and-a-half times more on storage than we need to. Taking all this into account, the original statement starts to take on the flavor of “Let’s spend more money than we need to because it’s cheap!”
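For the skeptical, here’s a quick sanity check of that arithmetic, using the 20GB → 50GB numbers from the quote above:

```python
# Sanity-checking the "rounding up" math from the example above.
requested_gb = 20   # what the requester actually thinks they need
allocated_gb = 50   # what they asked for after "rounding up"

increase_pct = (allocated_gb - requested_gb) / requested_gb * 100
multiplier = allocated_gb / requested_gb

print(f"Increase: {increase_pct:.0f}%")    # Increase: 150%
print(f"Multiplier: {multiplier:.1f}x")    # Multiplier: 2.5x
```

Not a rounding error – a 150% increase, or two-and-a-half times the stated need.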

Now, if this storage were just a one-time, fixed-cost item then it really wouldn’t be that bad…but it’s not. At this point, I’d like to coin a new Law – we’ll call it Snyder’s First Law of Capacity Planning. It goes something like this:

“A resource – once allocated – has a probability of being de-allocated of near zero.”

Applied to this problem, it becomes “Once storage has been allocated, you will very likely never get it back.” Not only will this storage continue to consume power, cooling, etc. ad infinitum, but we will now need to account for it in every storage subsystem outage, maintenance window, and migration from now until the end of time.

On the other hand, the above is precisely the kind of thing I get paid to be mindful of so that the people who are coming to me asking for resources don’t have to. So, in the interests of continuing to make a living while remaining sane, how can I make this better?

Well, for one thing, it’s seemed a little bit silly to me for quite a while now that any incoming requester should be asking for resources this specifically. What I mean is: I’m the Systems Guy, shouldn’t I be taking incoming requirements and translating them into specific resources? So maybe a better approach to the problem would be asking questions of the form “How much data do you have and what are you going to do with it?” rather than “How much disk do you think you’ll need?” (Of course, this won’t work in all cases, but it seems like it could be a fairly effective approach most of the time.)

Another way to handle this when asked for more storage than is necessary might be to have numbers on-hand going into these meetings. This way, when presented with consumer storage costs of something like 6 cents/GB I could “counter” with, say, $14/GB (or whatever it actually costs – I’m just making up a number here). A little end user education might go a long way.
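To see why that counter-number matters, here’s a rough sketch of what the 30GB of padding from the earlier example costs at each rate. (The $14/GB figure is the made-up illustrative number from above, not a real quote.)

```python
# What the 30GB of padding (20GB -> 50GB) costs at consumer vs. enterprise rates.
# The $14/GB enterprise figure is illustrative, not an actual price.
consumer_per_gb = 0.06     # ~6 cents/GB, bargain-bin SATA
enterprise_per_gb = 14.00  # made-up fully-loaded enterprise cost

padding_gb = 30  # the "just to be safe" extra from the opening example

print(f"Consumer cost of padding:   ${padding_gb * consumer_per_gb:,.2f}")    # $1.80
print(f"Enterprise cost of padding: ${padding_gb * enterprise_per_gb:,.2f}")  # $420.00
```

The requester is mentally spending a couple of bucks; the datacenter is actually spending a couple hundred – per request.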

I’m sure there are other approaches that I haven’t thought of yet; I’ll be sure to update the post if I think of any. I’d be glad to hear any ideas you’ve got in the comments…