RAIDzilla II and Backblaze Pod compared

Note: This discussion talks about the original Pod, not their 2nd (or later) generation ones. Also, I've never seen a Pod in person - I'm basing this entirely on their original Petabytes on a budget blog post.

Overview

A lot of people have mentioned the Backblaze Pod to me and suggested it was the inspiration for the second-generation RAIDzilla. Nothing could be further from the truth - the RAIDzilla II is an evolution of the original RAIDzilla (with more modern components). The Backblaze Pod wasn't publicly disclosed until more than 5 years after I built the original RAIDzilla. In fact, the backblaze.com domain wasn't even registered when I created the original RAIDzilla - they came along a couple of years later. I first started serious planning for the RAIDzilla II in January of 2009 (as you can see from this post I made on the freebsd-users mailing list).

First, the Backblaze Pod is designed for the most storage capacity at the lowest possible price. The RAIDzilla II is much more focused on speed than price - a RAIDzilla II costs a bit more than twice as much as a Pod and has a little less than half the raw storage space (33TB on a RAIDzilla II vs. 67.5TB on a Pod). When your market is backing up data for people on the other end of the Internet, speed isn't really that important - the customer's Internet connection is the bottleneck, not the disks.

As I mentioned above, the Backblaze Pod is targeted at a very different market segment than the RAIDzilla II. So I'm not going to say things like "mine is fast and theirs is slow" here - the two systems were intentionally designed for different uses. The Pod seems to suit Backblaze's needs, as well as the needs of some others who have built their own Pods.

However, I do have some specific criticisms of the Pod. It may be that some or all of these are based on incorrect assumptions about the Pod (I'm basing them on the information available in the Backblaze blog). In particular, this is based on their original blog post and the subsequent follow-up post - I haven't been keeping track of what they're doing lately, and they may well have addressed some or all of the issues I raise here.

Power

Their original blog post notes that a Pod will pull 14A at start-up and suggests that a Pod be powered up one power supply at a time. The Pod doesn't seem to have any internal mechanism to do this (since the motherboard is on the second power supply, it can't be running any code to manage an orderly power-up sequence). A datacenter with a number of Pods per rack will likely trip many circuit breakers when coming up from a cold start. Think you'll never have to cold start an entire datacenter? The generators in my facility in downtown Manhattan ran for 4+ days after 9/11, until a combination of (mostly) running out of fuel and overheating / contamination shut them down. The last thing you want to have to do is physically go to each rack and power on these systems, half a Pod at a time. The same applies when you're rolling a new rack into the datacenter to expand capacity.

Sure, there are remote-control power strips that can sequence power to multiple outlets (APC makes a number of different models). But you're still looking at several of these to serve a rack of Pods. You'll run out of power before you run out of outlets - even the largest APC 3-phase unit, the AP7990, is only rated for a total capacity of 48A across its three 120V legs. Even if you can provide the whole 14A at once because you have lots of power available, you'll likely get a very unpleasant surprise when the electric bill arrives - commercial power is often "demand metered", which means you get charged for the highest amount you drew at any one time as if you used that much power all the time.
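To put those numbers in perspective, here's a quick back-of-the-envelope calculation in Python. The 14A startup figure and the AP7990's 48A rating come from the discussion above; the Pods-per-rack count and steady-state draw are assumptions of mine, purely for illustration:

```python
# Back-of-the-envelope numbers for a rack of original Pods.
# The 14 A startup draw is from the Backblaze blog; the 48 A total
# is the AP7990's rating. The Pods-per-rack count and steady-state
# draw are my assumptions, for illustration only.

STARTUP_AMPS_PER_POD = 14.0    # per the original blog post
RUNNING_AMPS_PER_POD = 5.0     # assumed steady-state draw (~600 W at 120 V)
PODS_PER_RACK = 9              # assumed: nine 4U Pods in a standard rack
PDU_CAPACITY_AMPS = 48.0       # APC AP7990, total across three 120 V legs

print(f"Rack cold start, all at once: {STARTUP_AMPS_PER_POD * PODS_PER_RACK:.0f} A")
print(f"Pods one AP7990 can start simultaneously: "
      f"{int(PDU_CAPACITY_AMPS // STARTUP_AMPS_PER_POD)}")
print(f"Pods one AP7990 can run once spun up: "
      f"{int(PDU_CAPACITY_AMPS // RUNNING_AMPS_PER_POD)}")
```

Even with generous assumptions, a single strip can only bring up three Pods at once - the sequencing has to happen somewhere, and demand metering makes the "just throw power at it" answer expensive.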

Many SATA drives support delayed (staggered) spin-up, where the drives power on but don't start rotating until given a command by the disk controller. The RAIDzilla II uses this feature to reduce current draw when powering up. I'm not sure why this approach wasn't chosen for the Pod, as it would greatly reduce startup current. Perhaps the port expanders don't support this function, or the selected disk drives don't.
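For what it's worth, here's a minimal sketch of what staggered spin-up looks like, done from user space on FreeBSD with camcontrol(8). It assumes drives that have been set to power-up in standby and device names of my choosing; in a real system the disk controller sequences this itself at power-on, so this is only to illustrate the idea:

```python
# Illustration only: staggered spin-up driven from user space on
# FreeBSD via camcontrol(8). Assumes the drives are configured for
# power-up in standby (they stay spun down until told otherwise)
# and that they appear as da0..da14 - both assumptions of mine.

import subprocess
import time

DRIVES = [f"da{n}" for n in range(15)]   # assumed device names
SPINUP_INTERVAL = 5                      # seconds between drives

for drive in DRIVES:
    # "camcontrol start" issues a START STOP UNIT command with the
    # start bit set, telling the drive to spin up its platters.
    subprocess.run(["camcontrol", "start", drive], check=True)
    time.sleep(SPINUP_INTERVAL)  # let the spin-up surge pass before the next one
```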

Fans

I'll take Backblaze's statement that their Pods run quite cool in their datacenters as fact. However, I do have two issues that I think need to be raised. First, if someone is installing a number of Pods in a colocation facility where the facility provides the cabinets, they may face a difficult choice: either vastly reduce the capacity per rack (in order to run the racks with the doors on) or accept reduced security (running the racks with the front and back doors removed). Getting the facility to provide doors with full-area venting may or may not be possible.

Second, there doesn't seem to be any monitoring of fan performance in the Pod as described on the Backblaze blog. The picture of a Pod under construction in their original blog entry shows each of the 3 front fans attached by a twisted pair of yellow and black wires - only 2 wires per fan, with no third tachometer wire. That means there's no way to detect a fan that is slowing down (due to years of accumulated crud, bearing deterioration, and so on) or one that has stopped completely. Perhaps their web-based management interface polls the SMART data from each drive to report when a "hot spot" develops in the chassis.
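Since I'm only speculating about that, here's roughly what such polling could look like - a minimal sketch in Python that shells out to smartmontools' smartctl and flags drives running noticeably hotter than their neighbors. The device names and alert threshold are my assumptions, not anything Backblaze has described:

```python
# A sketch of the kind of hot-spot polling speculated about above:
# read each drive's temperature from SMART attribute 194 using
# smartmontools' smartctl, then flag drives running well above the
# average. Device names and the alert threshold are assumptions.

import subprocess

DRIVES = [f"/dev/da{n}" for n in range(15)]  # assumed device names
ALERT_DELTA_C = 8    # flag drives this many degrees C above the average

def drive_temp_c(dev):
    """Return the raw value of SMART attribute 194 (Temperature_Celsius)."""
    out = subprocess.run(["smartctl", "-A", dev],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) > 9 and fields[0] == "194":
            return int(fields[9])            # raw value column
    return None

temps = {dev: t for dev in DRIVES if (t := drive_temp_c(dev)) is not None}
if temps:
    average = sum(temps.values()) / len(temps)
    for dev, t in sorted(temps.items()):
        if t - average >= ALERT_DELTA_C:
            print(f"possible hot spot: {dev} at {t} C (chassis average {average:.1f} C)")
```

A drive running well above the chassis average in front of a dead fan is exactly the sort of thing this would catch - but it's an indirect measurement, and a far cry from a real tachometer signal.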

Based on my experience at many, many datacenters and telco facilities, I can tell you that alarm lights and audible alarms are almost always ignored - the only time someone visits the equipment is when they're told to go replace a part that has failed. Eventually, fans will fail, and then you'll need to disassemble the Pod to find out which one it is and replace it. Which leads to...

Serviceability

Sooner or later, you're going to need to replace something in a Pod - probably a fan or a disk drive. To do this, you're going to need to remove the Pod from the rack. In the Backblaze blog photos, the Pods are stacked one directly on top of the next. Even with rack slides, unless it's the only Pod (or the one at the top of the stack), you're going to have a lot of weight pressing down on the top of the case, making it difficult to slide out and even harder to put back.

Drives

A lot of people have taken Backblaze to task for using desktop-grade drives instead of "enterprise" ones, so I won't re-hash those arguments here. And I will be the first to tell you that brand-specific enterprise drives (like Dell-branded Seagates) don't have as much magic in them as you might think. Sure, the vendor will tell you that that is the only supported configuration. Most of the firmware customization is for branding, not to fix problems that only that particular vendor has discovered. It is true that a generic Seagate drive will eventually drop out of a Dell PERC array, but (oddly enough) the generic drive works just fine if you re-flash the PERC controller to the generic firmware as supplied by the controller's manufacturer. I've been in this part of the industry for many, many years (in fact, a company I ran was at one time the second-largest OEM consumer of 40MB ST506 drives, behind only DEC).

However, if you're going to use desktop drives in an enterprise application (however you define that), you want to make them as happy as possible. Drive makers haven't expressly prohibited the up-and-down orientation for many years (the Atasi AT3046 was an early example of a drive whose manufacturer prohibited that orientation, because it used a linear voice coil - very unusual in 5.25" or smaller drives).

Having said that, a number of drives have come with installation manuals stating that mounting them vertically along the long axis is not recommended. If you look at the inside of a modern hard drive, you'll see that there aren't any counterweights to balance the head actuator arm when the drive is operated in this position. Even if this doesn't cause any physical wear on the drive, seeks are likely to overshoot (or undershoot, depending on whether they were performed in the "down" or "up" direction), which will result in slower access times as the drive makes another attempt to position the heads. Since this is an uncommon mounting orientation, it is unlikely that drive manufacturers add anything special to their firmware to deal with it. If you look at integrators that previously mounted drives vertically (for example, Dell with their XPS R, XPS T, etc.), you'll see that their more modern systems no longer use that orientation.

Then there is the warranty issue. While there were rumors of manufacturers refusing to honor the warranty on desktop drives run 24 hours per day, those appear to have been just that - rumors. That may change in the future, however, as the manufacturers try to further differentiate their products - why would someone pay more for a drive if the hardware was identical except for the color of the label (green, blue, red, black or yellow) and the model number? We've already seen many manufacturers drop the warranty on consumer models to 3 years, while retaining a 5-year warranty on enterprise drives.