Old thread I know, but EricN, you are officially my favorite forum poster.
RAID, hard or soft, is not a backup solution.
As EricN mentions, the real advantage of RAID is in protecting system access: it REDUCES DOWNTIME when a disk fails. RAID *may* be useful in mitigating the risk of hardware failure, but it is NO substitute for proper backup solutions.
Personally, I would consider even RAID + backups to be insufficient in this day and age of massive data storage. IMHO any real solution would include complete mirroring of any server(s) in a server "hot-swap" scenario in addition to the aforementioned (common) approach, and also the use of cloud services where practical/feasible. Of course, physical separation of redundancy solutions is a must for any serious data warrior.
I would imagine for the average HTPC enthusiast, mirroring terabytes of video files on a separate machine is likely an impossible dream in terms of labour, cost, and deemed priority. I know I personally rely on RAID setups when managing my movie and TV stores. "Important" data however, the kind that is irreplaceable, or representative of many many many hours of work, I have duplicated in drive bender, and additionally backed up both on a physically separate machine and in the cloud.
A couple of examples where RAID would not help you:
1. Human error
There are many ways that you can do something wrong and lose data, but here are two common ones:
Accidental file deletion: We have all accidentally deleted a file or two (or more). In this case, deleting a file from a RAID array is no different than deleting it from a single drive. If you really need to recover the file and don't have a backup, you can try data recovery software and generally use the same approach as if it were a single drive. The success rate varies with the filesystem type and overall situation, but it is nowhere near 100%.
Making a mistake when working with a RAID: This can be as simple as pulling a good disk in a failed array by accident. Other failure "opportunities" arise during resync of a failed array, RAID level migration and/or RAID expansion. The latter is particularly error-prone since it involves multiple disk swaps and resyncs. One wrong step and your data is gone.
Even with products that provide "automatic" RAID recovery, success is not guaranteed. Poor documentation and badly designed user feedback mechanisms (status and progress displays) can cause users to do the wrong thing at the wrong time and mistakenly kill the recovery process.
2. RAID controller / software failure
RAID arrays can be managed by dedicated hardware RAID controllers, RAID software or a combination of both. Either can fail. Data can often be recovered, but recovering from a backup is significantly faster.
For example, if a controller fails, you need to either purchase exactly the same controller and try to recover the array in its original configuration, or recover the array parameters using special RAID recovery software. In the latter case, you also need to provide storage to copy the recovered array data to.
Keep in mind that in both cases, recovery takes from several days to several weeks. To repeat: recovering from a backup is significantly faster! Although you might say "Oh, that's all right, we will wait as long as necessary", in practice, it always turns out that the data is very important and needed right away. Once the actual problem happens, no one will be willing to wait a week.
Of course, there are cases where a malfunctioning controller scrambles data so badly that it cannot be "cured" by data recovery software.
3. Fire, flood or other calamity
Your RAID can have redundancy, a hot spare disk, protection from a controller failure, a UPS, etc. Nevertheless, your RAID, like data on a single drive, can be destroyed by fire or other calamity. In such a case, only regular backups stored off-site can recover lost data.
We had a case where a flood did not directly affect the storage arrays, but created enough humidity in the room to force a controller to initialize the disks without a command. Unusual, perhaps, but the data was still gone.
4. Theft, hacker attack, or other offensive action
Anything can be stolen and RAID is not an exception. As modern data storage devices become smaller, they become easier to steal. Modern encryption systems may prevent a thief from accessing confidential data, but encryption doesn't help you get your data back. As in the case of fire, flood or other calamity, the only thing that helps you recover data is a backup.
If you have ever been hacked, or even caught someone messing with your computer or NAS, you are confronted with a choice. You can go through your files one by one looking for lost or modified data. Or simply recover the data from a backup and go on with life.
In this case, it's important to have more than one backup, or use versioned backup, in case the hack is subtle and remains undiscovered for days, if not weeks.
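For the "go through your files one by one" option, a checksum manifest taken while the data is known to be good can do the comparing for you. The following is only a minimal sketch of that idea in Python, not something from the original discussion: the function names `sha256_of`, `build_manifest` and `diff_manifests` are hypothetical, and it assumes the manifest itself is stored somewhere an intruder cannot touch.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large video files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = sha256_of(path)
    return manifest


def diff_manifests(old, new):
    """Return (missing, modified, added) lists of relative paths between two manifests."""
    missing = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    modified = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return missing, modified, added
```

Running `build_manifest` on the same folder before and after a suspected incident and passing both results to `diff_manifests` lists what disappeared, what changed and what appeared. It only tells you which files to pull from a backup; it does not replace the backup.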
5. Multiple disk failures and URE
A RAID 5 array protects your data against a single disk failure, while RAID 6 can withstand up to two disk failures. If the disks fail independently, the probability that the second (or third) disk fails before the RAID is restored is negligible. In real life, however, disks can have much more in common than it might seem.
Disks used in a RAID are usually the same model, often from the same manufacturing batch and sometimes even with sequential serial numbers. All these disks work under the same load and are subjected to the same environment - temperature, vibration, and power spikes. More than that, if a disk has some factory defect, as in the Seagate 7200.11 disks, the entire set is likely to develop the defect nearly simultaneously.
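To put a rough number on the "negligible" independent-failure case, here is a small back-of-the-envelope sketch. All the figures are assumptions chosen for illustration (a 3% annual failure rate, a 4-disk RAID 5 with 3 surviving disks, a 24-hour rebuild), and the function name is hypothetical. It models failures as independent with a constant rate, which is exactly the assumption that same-batch disks tend to violate.

```python
import math


def second_failure_probability(afr, surviving_disks, rebuild_hours):
    """Chance that at least one surviving disk fails during the rebuild,
    assuming independent failures at a constant rate."""
    hours_per_year = 24 * 365
    # Convert the annual failure rate into a constant hourly hazard rate.
    hourly_rate = -math.log(1.0 - afr) / hours_per_year
    # Probability that at least one of the surviving disks fails in the window.
    return 1.0 - math.exp(-hourly_rate * surviving_disks * rebuild_hours)


# Assumed figures: 3% AFR, 3 surviving disks, 24-hour rebuild.
p = second_failure_probability(afr=0.03, surviving_disks=3, rebuild_hours=24)
print(f"Chance of a second failure during rebuild: {p:.4%}")
```

With these assumed inputs the result is a few hundredths of a percent, so under true independence the risk really is small. Correlated failures from a shared batch, shared load or a shared power event are not captured by this model at all.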
In a RAID 5, you can encounter the so-called URE (Unrecoverable Read Error) problem: a noticeable probability of a read error occurring while rebuilding an array after a disk failure. However, modern drives are reliable enough that URE ranks no higher than third among the causes of RAID recovery, behind human error and multiple disk failure.
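The URE worry usually comes from taking a datasheet figure at face value. As a hedged illustration, the sketch below assumes a quoted rate of one unrecoverable error per 10^14 bits read (a commonly cited consumer-drive spec, not a number from this thread) and treats it as an independent per-bit probability, which is a simplification. The function name and the 4 x 2 TB array are hypothetical.

```python
import math


def ure_probability(bytes_read, errors_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading bytes_read,
    treating the datasheet rate as an independent per-bit probability."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-errors_per_bit))


# Assumed array: 4 x 2 TB RAID 5, so a rebuild reads the 3 surviving disks (6 TB).
p_ure = ure_probability(6e12)
print(f"Chance of hitting a URE during rebuild: {p_ure:.0%}")
```

On paper that comes out to roughly 38% for this assumed array, which is where the alarm comes from. The datasheet number is a worst-case bound, though, and drives in practice tend to do considerably better, which fits with URE ranking behind human error and multiple disk failure.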
I'm not lulled into thinking that by running unraid, flexraid, zfs, drive bender or any number of other similar solutions, I have somehow "protected my data". Many RAID proponents fail to acknowledge the simple fact that RAID is inherently vulnerable to numerous threats to data integrity. Having a system that has been running stable for 5 years says nothing about the safety of your data, as that can only be tested when things go wrong. Simple Google searches on the topic of RAID and backup will quickly illustrate how many serious tech people feel that relying on RAID as data protection is a fool's game.
I am pretty much a lurker on these boards, and I know from reading that there are a lot of members who feel differently about the utility of software such as flexraid. It is not my intent to insult or in any way diminish your experiences and opinions. My primary interest is in expressing views and having some constructive discussion on the topic of data protection.
Having said all that I have, it should be noted that I am not, and never really have been, a tech person by trade. My interest in the subject is for the more practical reason of wishing to ensure that I am using the "best" solutions in the long term as I set up and integrate devices into my home and lifestyle. I am in the process of deciding what setup I will use when I finally migrate my AV collection from the current WHS solution, and flexraid has been striking a chord with me, primarily because of the great community (that's certainly what has it beating unraid).
Like everybody else, I don't want to end up in a situation where I have to re-rip multiple terabytes of video content. Nor do I personally wish to be complacent in thinking that flexraid has eliminated the risk of that happening.
I think my most likely scenario at this stage will be the dual server approach. I have two N40L boxes, one doing not much at all, the other acting as my WHS media server. I am wondering if anybody else is running duplicate servers, and if so what they have based their setup on. I do have two WHS licenses, but am not averse to the idea of running mirrored flexraid setups if it is warranted. My problem lies in the fact that, as I said, I am not a tech guy, and my experience in configuring such a setup is zero :/
Any thoughts or suggestions as to how I should approach this, or indeed whether it is even a worthwhile task?