Originally Posted by V_J
The noise actually also helped a bit... although I have the feeling the noise was mainly about semantics: classical raid definition (ajhieb) vs. other products such as snapraid (htpcforever). In general it seems to me that they agree: classical raid (raid 0-6) cannot recover deleted files, snapraid (among others) can recover under some circumstances. While I know snapraid can recover files to some extent, I personally don't consider it a backup: a backup should be able to recover everything, whenever you want.
I hope I did not give the moderators too much work with this thread (sorry markrubin :-))...
For my purpose, I have a short-term and a long-term plan. I know that in the long term, a separate storage server/nas is probably the way to go. But I first want to upgrade the htpc and my storage. The current htpc is not usable as storage server (it is this Zotac Atom mini PC that lacks many expansion ports) and my other PC is an old dual Xeon, which is too noisy to have powered on all the time as storage server - but it can serve nicely as backup server. The short-term plan is to get a completely new htpc with enough storage for my current library. I don't want any external disks connected to the htpc (I'm going for tight, minimalistic design: design rack with nice receiver and nice htpc case).
I wanted to avoid a typical raid, to have easier dynamic expansion, so I started looking at other options that allow for a more dynamic growth (snapraid, drivepool, ... allow for that).
Now, I'm just thinking of getting 2x4 TB disks in the new HTPC, adding some redundancy (not sure yet how: snapraid, backup script that copies some folders from one disk to the next, drivepool, ... - I'm leaning towards drivepool for its transparency) and then backing up to my current disks (I have about 4.5 TB net storage, mix of internal and external disks, which I can connect to the dual Xeon). The things I really want to keep I can back up to multiple hdds and to optical disks.
This would be quite a straightforward system to set up, and would make it easy enough to expand (adding a hdd is possible, reuse disks in a storage server later, ...).
At the rate my library is growing, this ought to suffice for now (1 year or so), after which I can buy more hdd space and evaluate if things should be more automated or not. Harddisks are never wasted, and can be reused in a storage server/nas later if necessary.
What it comes down to is Real-Time RAID vs. Snapshot RAID.
Real-Time RAID means that parity is calculated and written at the same time the data is written to the data storage portion of the array. Hardware RAID, ZFS, etc. all use Real-Time RAID. SnapRAID, FlexRAID and unRAID all use or can use Snapshot, where data is written to the array or a cache drive and parity is only updated at some specified time. Because a change on the data disks isn't reflected in parity right away, a deletion can be undone: until the next parity update, the parity still describes the deleted file, so it can be rebuilt from the parity plus the other data disks. Once the deletion has been written through to parity, the file is gone. That's the reason you can't do it with Hardware RAID and ZFS: parity is calculated and written concurrently with the data, and the parity data is striped across the same disks as the data storage.
Don't get confused by RAID levels. They're really not relevant here. RAID 5 & 6 style parity is what the Snapshot RAID systems all utilize.
It's up to the developers of the Snapshot RAID implementation to provide an undelete function. It can be done, but there's a timer: it only works until the next parity update.
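If it helps to see the snapshot idea in miniature, here's a toy sketch in Python. It is not how SnapRAID, FlexRAID or unRAID are actually implemented. It just uses plain XOR single parity over three made-up "disks" (the names d1, d2, d3 and the 4-byte blocks are purely for illustration) to show why a deletion can be undone before the next parity update and not after it.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (simple single-parity math)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
disks = {"d1": b"MOVI", "d2": b"SHOW", "d3": b"PICS"}

# Snapshot sync: parity is computed at a scheduled time, not on every write.
parity = xor_blocks(list(disks.values()))

# The block on d2 is deleted after the sync; parity has NOT been updated yet.
disks["d2"] = bytes(4)

# Undelete: the stale parity plus the other data disks still give the old block.
recovered = xor_blocks([parity, disks["d1"], disks["d3"]])

# Next sync: parity now matches the post-deletion state, so the old block is gone.
parity = xor_blocks(list(disks.values()))
after_sync = xor_blocks([parity, disks["d1"], disks["d3"]])

print(recovered)   # the original d2 block
print(after_sync)  # all zero bytes, nothing left to recover
```

With real-time parity the "next sync" step effectively happens together with the delete itself, so there's never a window in which the parity still remembers the old data.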
Personally, I hadn't seen a need for RAIDish type things until my library was large enough that I would be reasonably inconvenienced if my array went down. Until that point, Drive Pooling with an external backup for critical data like photos and documents was sufficient. I consider my original Blu-Rays, DVDs and CDs to be the backups of my rips, and most of them have been encoded. If I ever lost my primary storage I could simply re-rip and re-encode.
Currently, the size of the library means that task would take months, and I don't want to spend the $$$ for another NAS just for backups, so now I run RAID 5 with FlexRAID. That way, if I have a catastrophic array failure, I only lose data on the failed drives. It may take a few days or weeks to recover an entire drive, but that's one reason why I am holding at 2TB drives for now.
Drive Pool + Backup for non-replaceable data while the library is smallish
RAID-type implementation + Backup for non-replaceable data once the library is large enough that recovery time of non-RAID storage outweighs the cost of the RAID implementation