The focus on drive speed is a case of the streetlight effect. Measuring throughput is easy. Measuring thermal, noise, & load characteristics is hard. Measuring reliability is extremely hard. This is why the vast majority of hard drive reviews are just freeware benchmarks run on a few drives connected to some random hardware, with enough charts & graphs to spread the review across a dozen ad-riddled pages.
As Dameleon points out, we don't care about throughput. Everything relevant to media storage gets covered in that tiny penultimate paragraph on the 12th page labeled "Heat, Noise, & Reliability". Some sites used to take it seriously, but even StorageReview stopped lab testing heat and noise on consumer spindles years ago, and SPCR never reviewed enough drives.
The reliability question is even harder. Even review sites that sink real effort into answering it don't succeed. StorageReview's Reliability Database demonstrated that the plural of "anecdote" is not "data"; a consistent test environment is crucial. Hardware.fr has been publishing retail return rates for specific drive models, but they only cover the first year and the sample sizes are only hundreds to a few thousand units...which is too small and too indirect a proxy to distill failure rates from.
The best data is still the CMU study and the Google study, each measuring the failures of over 100,000 drives. The biggest surprise was that there was no "bathtub curve" effect for HDD failures like there is with other hardware. The Annualized Failure Rate (AFR) starts out low (~1-2%) in the first year and keeps climbing from there; by the time the warranty expires, it's already between 5% and 10%. There is no "flat of the tub" grace period. The other big surprise was that failure rate didn't correlate with load: it didn't matter whether the drive was in constant use, mostly idle, or mostly spun down.
The question that we're trying to answer is "What are the 3rd and 4th year failure rates?" During that still-in-service-but-out-of-warranty period, when we have to eat the cost of the failed drive, is it 5% or 10%? It really matters once you start talking about multiple drives. If you calculate the chance of at least one drive failing in an n-drive array, this is what it looks like:
Chance of one or more failures over time in an n-drive array

| n | Y1 (AFR 1%) | Y2 (AFR 3%) | Y3 (AFR 5%) | Y4 (AFR 7%) |
|---|---|---|---|---|
| 1 | 1% | 4% | 9% | 15% |
| 2 | 2% | 8% | 17% | 28% |
| 3 | 3% | 11% | 24% | 39% |
| 4 | 4% | 15% | 31% | 48% |
| 5 | 5% | 18% | 37% | 56% |
| 6 | 6% | 22% | 42% | 63% |
| 7 | 7% | 25% | 47% | 68% |
| 8 | 8% | 28% | 52% | 73% |

| n | Y1 (AFR 2%) | Y2 (AFR 5%) | Y3 (AFR 8%) | Y4 (AFR 11%) |
|---|---|---|---|---|
| 1 | 2% | 7% | 14% | 24% |
| 2 | 4% | 13% | 27% | 42% |
| 3 | 6% | 19% | 37% | 56% |
| 4 | 8% | 25% | 46% | 66% |
| 5 | 10% | 30% | 54% | 74% |
| 6 | 11% | 35% | 61% | 80% |
| 7 | 13% | 39% | 66% | 85% |
| 8 | 15% | 44% | 71% | 89% |
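The tables above are just compound probability: each year a single drive survives with probability (1 - AFR), and the chance that at least one of n drives has failed by year Y is 1 minus that cumulative survival probability raised to the nth power. Here's a minimal Python sketch that reproduces the numbers; the two AFR series are the same illustrative 1%/3%/5%/7% and 2%/5%/8%/11% values used in the tables, not measured figures:

```python
# Cumulative chance of at least one failure in an n-drive array,
# compounding per-year annualized failure rates (AFRs).
def cumulative_failure_chance(afrs, n):
    """Return P(at least one of n drives has failed) after each year."""
    single_drive_survival = 1.0
    results = []
    for afr in afrs:
        single_drive_survival *= (1.0 - afr)               # drive survives this year too
        results.append(1.0 - single_drive_survival ** n)   # >=1 of n drives has failed
    return results

# Illustrative AFR series from the two tables above.
afr_low  = [0.01, 0.03, 0.05, 0.07]
afr_high = [0.02, 0.05, 0.08, 0.11]

for n in range(1, 9):
    yearly = cumulative_failure_chance(afr_low, n)
    print(n, [f"{p:.0%}" for p in yearly])
```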
Running multi-drive arrays for a few years magnifies a tiny difference between two models, up to a point. Beyond about 8 disks, failures become effectively unavoidable, and it makes sense to stock up on spares in advance (like light bulbs). But the typical SOHO NAS/media server isn't quite that big, and can still really benefit from using the more reliable drives.
Getting the details on those failure curves is tricky. Manufacturers only give two data points, MTBF & warranty length, which aren't enough to extrapolate from. Even with CMU & Google, by the time they collected and analyzed the data, the drives they studied were end-of-life, making the data somewhat useless as a shopping aid. The best source I know of is the warranty underwriters, who actually get to peek behind the manufacturers' curtains and see the raw failure data, and who have to wager their own money on predicting the reliability of the drives currently for sale.
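To put rough numbers on why MTBF alone isn't much help: a spec-sheet MTBF converts to a nominal AFR of roughly (powered-on hours per year) / MTBF, so a hypothetical 1,000,000-hour MTBF running 24/7 works out to under 1% per year, nowhere near the 5-10% the field studies see in the out-of-warranty years. A quick sketch (the 1,000,000-hour figure is just an assumed example, not any particular drive's spec):

```python
# Rough conversion from a spec-sheet MTBF to a nominal annualized failure rate,
# assuming the drive is powered on 24/7.
HOURS_PER_YEAR = 24 * 365

mtbf_hours = 1_000_000          # hypothetical example spec, not a measured value
nominal_afr = HOURS_PER_YEAR / mtbf_hours

print(f"nominal AFR: {nominal_afr:.2%}")   # ~0.88%, far below field-observed rates
```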
tl;dr: A small difference in reliability multiplied by several drives and several years becomes meaningful. The WD Reds are probably worth it.