storage: raid vs backup - Page 3 - AVS Forum
post #61 of 75 Old 09-01-2014, 04:07 PM
Super Moderator
 
markrubin's Avatar
 
Join Date: May 2001
Location: Jersey Shore
Posts: 23,099
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 131 Post(s)
Liked: 480
move on please
markrubin is offline  
post #62 of 75 Old 09-02-2014, 02:33 AM - Thread Starter
V_J
Member
 
Join Date: Jun 2014
Posts: 110
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 31 Post(s)
Liked: 12
Quote:
Originally Posted by morganf View Post
Fortunately, it seems the OP managed to get some useful information out of the thread despite the noise.
The noise actually also helped a bit... although I have the feeling the noise was mainly about semantics: classical raid definition (ajhieb) vs. other products such as snapraid (htpcforever). In general it seems to me that they agree: classical raid (raid 0-6) cannot recover deleted files, snapraid (among others) can recover under some circumstances. While I know snapraid can recover files to some extent, I personally don't consider it a backup: a backup should be able to recover everything, whenever you want.
I hope I did not give the moderators too much work with this thread (sorry markrubin :-))...

For my purpose, I have a short-term and a long-term plan. I know that in the long term, a separate storage server/nas is probably the way to go. But I first want to upgrade the htpc and my storage. The current htpc is not usable as storage server (it is this Zotac Atom mini PC that lacks many extension ports) and my other PC is an old dual Xeon, which is too noisy to have powered on all the time as storage server - but it can serve nicely as backup server. The short-term plan is to get a completely new htpc with enough storage for my current library. I don't want any external disks connected to the htpc (I'm going for tight, minimalistic design: design rack with nice receiver and nice htpc case).

I wanted to avoid a typical raid, to have easier dynamic expansion, so I started looking at other options that allow for a more dynamic growth (snapraid, drivepool, ... allow for that).

Now, I'm just thinking of getting 2x4 TB disks in the new HTPC, adding some redundancy (not sure yet how: snapraid, backup script that copies some folders from one disk to the next, drivepool, ... - I'm leaning towards drivepool for its transparency) and then backing up to my current disks (I have about 4.5 TB net storage, mix of internal and external disks, which I can connect to the dual Xeon). The things I really want to keep I can back up to multiple hdds and to optical disks.
This would be quite a straightforward system to set up, and would make it easy enough to expand it (adding a hdd is possible, reuse disks in a storage server later, ...).

At the rate my library is growing, this ought to suffice for now (1 year or so), after which I can buy more hdd space and evaluate if things should be more automated or not. Harddisks are never wasted, and can be reused in a storage server/nas later if necessary.
ajhieb likes this.
V_J is offline  
post #63 of 75 Old 09-02-2014, 07:47 AM
Senior Member
 
smitbret's Avatar
 
Join Date: Nov 2005
Location: East Idaho - Pocatello
Posts: 383
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 127 Post(s)
Liked: 47
Quote:
Originally Posted by V_J View Post
The noise actually also helped a bit... although I have the feeling the noise was mainly about semantics: classical raid definition (ajhieb) vs. other products such as snapraid (htpcforever). In general it seems to me that they agree: classical raid (raid 0-6) cannot recover deleted files, snapraid (among others) can recover under some circumstances. While I know snapraid can recover files to some extent, I personally don't consider it a backup: a backup should be able to recover everything, whenever you want.
I hope I did not give the moderators too much work with this thread (sorry markrubin :-))...

For my purpose, I have a short-term and a long-term plan. I know that in the long term, a separate storage server/nas is probably the way to go. But I first want to upgrade the htpc and my storage. The current htpc is not usable as storage server (it is this Zotac Atom mini PC that lacks many extension ports) and my other PC is an old dual Xeon, which is too noisy to have powered on all the time as storage server - but it can serve nicely as backup server. The short-term plan is to get a completely new htpc with enough storage for my current library. I don't want any external disks connected to the htpc (I'm going for tight, minimalistic design: design rack with nice receiver and nice htpc case).

I wanted to avoid a typical raid, to have easier dynamic expansion, so I started looking at other options that allow for a more dynamic growth (snapraid, drivepool, ... allow for that).

Now, I'm just thinking of getting 2x4 TB disks in the new HTPC, adding some redundancy (not sure yet how: snapraid, backup script that copies some folders from one disk to the next, drivepool, ... - I'm leaning towards drivepool for its transparency) and then backing up to my current disks (I have about 4.5 TB net storage, mix of internal and external disks, which I can connect to the dual Xeon). The things I really want to keep I can back up to multiple hdds and to optical disks.
This would be quite a straightforward system to set up, and would make it easy enough to expand it (adding a hdd is possible, reuse disks in a storage server later, ...).

At the rate my library is growing, this ought to suffice for now (1 year or so), after which I can buy more hdd space and evaluate if things should be more automated or not. Harddisks are never wasted, and can be reused in a storage server/nas later if necessary.
What it comes down to is Real-Time RAID vs. Snapshot RAID.

Real-Time RAID means that parity is written at the same time as the data is written to the data storage portion of the array. Hardware RAID, ZFS, etc. all use Real-Time RAID. SnapRAID, FlexRAID and unRAID all use or can use Snapshot, where data is written to the array or a cache drive and parity is only updated at some specified time. Because parity hasn't yet been updated to reflect a deletion, you can undo the deletion by reconstructing the data from the parity drive and the remaining data disks. Once the deletion has been synced to parity, the data is gone. That's the reason you can't do it with Hardware RAID and ZFS: parity is calculated and written concurrently with the data, and the parity data is striped across the same disks as the data storage.
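
To make the "timer" concrete, here is a toy sketch of the idea using plain XOR parity over one block per disk. It is only an illustration under simplifying assumptions (three data disks, one dedicated parity disk, 8-byte blocks, made-up contents); it is not how SnapRAID or FlexRAID are actually implemented.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical toy array: three data disks, one 8-byte block each.
disks = [b"movie-A!", b"movie-B!", b"photos!!"]

# Snapshot parity: computed at sync time, not on every write.
parity = xor_blocks(disks)

# The block on disk 1 is deleted AFTER the sync, so parity is now stale.
deleted_index = 1
survivors = [d for i, d in enumerate(disks) if i != deleted_index]

# Before the next sync, stale parity plus the other disks rebuild the block.
recovered = xor_blocks(survivors + [parity])

# After the next sync, parity reflects the deletion and the block is gone.
disks_after = [disks[0], bytes(8), disks[2]]
parity_after = xor_blocks(disks_after)
lost = xor_blocks([disks_after[0], disks_after[2], parity_after])
```

In this sketch `recovered` equals the deleted block, while `lost` is all zero bytes: the window for undeletion closes at the next parity sync.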

Don't get confused by RAID levels. It's really not relevant. RAID 5 & 6 variants are what the Snapshot RAID systems all utilize.

It's up to the developers of the Snapshot RAID implementation to make the undeletion function. It can be done, but there's a timer.

Personally, I hadn't seen a need for RAIDish type things until my library was large enough that I would be reasonably inconvenienced if my array went down. Until that point, Drive Pooling with an external backup for critical data like photos and documents was sufficient. I consider my original Blu-Rays, DVDs and CDs to be the backups of my rips and most of them have been encoded. If I ever lost my primary storage I could simply rerip and re encode.

Currently, the size of the library means that task would take months and I don't want to spend the $$$ for another NAS just for backups so now I RAID 5 with FlexRAID. That way, if I have catastrophic array failure, I only lose data on the failed drives. It may take a few days or weeks to recover an entire drive but that's one reason why I am holding at 2TB drives for now.

Condensed:
Drive Pool + Backup for non-replaceable data while the library is smallish

RAID-type implementation + Backup for non-replaceable data once the library is large enough that recovery time of non-RAID storage outweighs the cost of the RAID implementation

Last edited by smitbret; 09-02-2014 at 08:01 AM.
smitbret is offline  
post #64 of 75 Old 09-02-2014, 10:46 AM
AVS Special Member
 
ajhieb's Avatar
 
Join Date: Jul 2009
Posts: 1,602
Mentioned: 7 Post(s)
Tagged: 0 Thread(s)
Quoted: 436 Post(s)
Liked: 434
Quote:
Originally Posted by smitbret View Post
Don't get confused by RAID levels. It's really not relevant. RAID 5 & 6 variants are what the Snapshot RAID systems all utilize.
If you want to say the RAID levels are irrelevant, that's fine, but don't turn around and then immediately make an incorrect statement about the RAID levels.

RAID 5 & 6 both utilize striped data and distributed parity. Those are the defining qualities. None of the snapshot implementations I'm aware of stripe the data. None of the snapshot implementations I'm aware of utilize distributed parity. The only thing they have in common at all is the use of parity information.

RAID 3 or 4 would be a closer approximation as both of those use a dedicated parity disk.
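
A rough way to picture the difference is to print where parity lives for each stripe. The sketch below assumes a 4-disk array and one simple rotation order for the distributed case; real RAID 5 implementations use several different rotation schemes, and the function names here are made up for illustration.

```python
def parity_disk(stripe, n_disks, distributed):
    """Return the index of the disk holding parity for a given stripe."""
    if distributed:
        # RAID 5 style: parity rotates to a different disk on each stripe.
        return (n_disks - 1 - stripe) % n_disks
    # RAID 3/4 style: parity always sits on one dedicated disk.
    return n_disks - 1

def layout(n_disks, n_stripes, distributed):
    """Return one text row per stripe, marking data (D) and parity (P)."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_disk(stripe, n_disks, distributed)
        rows.append(" ".join("P" if d == p else "D" for d in range(n_disks)))
    return rows

if __name__ == "__main__":
    print("dedicated parity:")
    print("\n".join(layout(4, 4, distributed=False)))
    print("distributed parity:")
    print("\n".join(layout(4, 4, distributed=True)))
```

With dedicated parity the P column never moves; with distributed parity it shifts by one disk per stripe. The snapshot products additionally keep whole files on individual disks rather than striping them, which this layout sketch does not model.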

RAID protection is only for failed drives. That's it. It's no replacement for a proper backup.
ajhieb is offline  
post #65 of 75 Old 09-02-2014, 02:13 PM
AVS Special Member
 
EricN's Avatar
 
Join Date: May 2002
Posts: 1,244
Mentioned: 2 Post(s)
Tagged: 0 Thread(s)
Quoted: 64 Post(s)
Liked: 201
Quote:
Originally Posted by ajhieb View Post
RAID 5 & 6 both utilize striped data and distributed parity. Those are the defining qualities. None of the snapshot implementations I'm aware of stripe the data. None of the snapshot implementations I'm aware of utilize distributed parity. The only thing they have in common at all is the use of parity information.

RAID 3 or 4 would be a closer approximation as both of those use a dedicated parity disk.
I had a professor that enjoyed the aphorism, "Computer Science: the only field where ontogeny actually recapitulates phylogeny" ¹

His point was that in the software industry, the development of a new product over the span of a few years will replicate the steps the industry took as it progressed across decades. He'd pull case study after case study, showing how companies would rediscover ideas from the 1960s and implement them. For version 2, they move to concepts from the '70s, and so on, reinventing the same wheels and discarding them for the same reasons.

It's what creates this phenomenon. The industry hasn't matured yet, and there's still room for a code-first, design-last approach.

¹ Recapitulation theory
ajhieb likes this.
EricN is offline  
post #66 of 75 Old 09-02-2014, 02:18 PM
Senior Member
 
smitbret's Avatar
 
Join Date: Nov 2005
Location: East Idaho - Pocatello
Posts: 383
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 127 Post(s)
Liked: 47
Quote:
Originally Posted by ajhieb View Post
If you want to say the RAID levels are irrelevant, that's fine, but don't turn around and then immediately make an incorrect statement about the RAID levels.

RAID 5 & 6 both utilize striped data and distributed parity. Those are the defining qualities. None of the snapshot implementations I'm aware of stripe the data. None of the snapshot implementations I'm aware of utilize distributed parity. The only thing they have in common at all is the use of parity information.

RAID 3 or 4 would be a closer approximation as both of those use a dedicated parity disk.
Apparently, reading comprehension isn't your strong point. If you'll look closely, the word "Variant" appears in the description of Snapshot RAID.
smitbret is offline  
post #67 of 75 Old 09-02-2014, 02:58 PM
AVS Special Member
 
Dark_Slayer's Avatar
 
Join Date: May 2012
Posts: 2,662
Mentioned: 5 Post(s)
Tagged: 0 Thread(s)
Quoted: 239 Post(s)
Liked: 317
Quote:
Originally Posted by EricN View Post
It's what creates this phenomenon
BT Sync would have been the answer

Quote:
Originally Posted by smitbret View Post
Apparently, reading comprehension isn't your strong point. If you'll look closely, the word "Variant" appears in the description of Snapshot RAID.
When we used to talk about FlexRAID a lot more in these forums it was always described as "closest-to-raid4" and a wiki page eventually came later providing some insight on the engines. http://wiki.flexraid.com/about/raid-engines/ I would expect snapraid to be similar (the page refers to the raid-f flexraid implementation)
Dark_Slayer is offline  
post #68 of 75 Old 09-02-2014, 03:11 PM
AVS Special Member
 
ajhieb's Avatar
 
Join Date: Jul 2009
Posts: 1,602
Mentioned: 7 Post(s)
Tagged: 0 Thread(s)
Quoted: 436 Post(s)
Liked: 434
Quote:
Originally Posted by smitbret View Post
Apparently, reading comprehension isn't your strong point. If you'll look closely, the word "Variant" appears in the description of Snapshot RAID.
RAID 5 & 6 both utilize striped data and distributed parity. Those are the defining qualities. None of the snapshot implementations I'm aware of stripe the data. None of the snapshot implementations I'm aware of utilize distributed parity. The only thing they have in common at all is the use of parity information.

RAID 3 or 4 would be a closer approximation as both of those use a dedicated parity disk.

If you'll look closely, you'll notice the word "variant" in your description has no bearing on what I said.

For the record, I agree that for the purposes of the OP the RAID levels don't really matter much. The only reason I brought them up earlier was in regards to a side discussion regarding a claim made by another poster. I think the OP only cares about the functionality and it doesn't matter if a given product is some variation (or as @EricN hinted at, a predecessor) of RAID-3 or RAID-5. It only matters that it allows for some level of redundancy for his data.

RAID protection is only for failed drives. That's it. It's no replacement for a proper backup.

Last edited by ajhieb; 09-02-2014 at 03:29 PM.
ajhieb is offline  
post #69 of 75 Old 09-02-2014, 04:06 PM
Senior Member
 
smitbret's Avatar
 
Join Date: Nov 2005
Location: East Idaho - Pocatello
Posts: 383
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 127 Post(s)
Liked: 47
Quote:
Originally Posted by ajhieb View Post
RAID 5 & 6 both utilize striped data and distributed parity. Those are the defining qualities. None of the snapshot implementations I'm aware of stripe the data. None of the snapshot implementations I'm aware of utilize distributed parity. The only thing they have in common at all is the use of parity information.

RAID 3 or 4 would be a closer approximation as both of those use a dedicated parity disk.

If you'll look closely, you'll notice the word "variant" in your description has no bearing on what I said.

For the record, I agree that for the purposes of the OP the RAID levels don't really matter much. The only reason I brought them up earlier was in regards to a side discussion regarding a claim made by another poster. I think the OP only cares about the functionality and it doesn't matter if a given product is some variation (or as @EricN hinted at, a predecessor) of RAID-3 or RAID-5. It only matters that it allows for some level of redundancy for his data.
Fine, although you will find FlexRAID, SnapRAID and unRAID most commonly compared to RAID 5 & 6 in the real world, I concede that they are more closely related to RAID 3 & 4. Shame on me for perpetuating this commonly accepted myth. I will, henceforth, also stop referring to the 'American Bison' as 'Buffalo', I will also make sure that I use 'Analogy', 'Metaphor' and 'Satire' in their proper sense and also no longer refer to 'Pandas' as 'Bears'.

Together we may be able to change the world.

Fortunately, it still doesn't make a bit of difference what RAID level we call it. Irregardless..... whoops.... I mean Regardless, it's completely irrelevant to the original question. Snapshot vs. RealTime is the difference. As far as RAID levels go, OP could care less...... Oops, these real world inconsistencies are everywhere. I meant "couldn't care less." My bad.

Last edited by smitbret; 09-02-2014 at 04:12 PM.
smitbret is offline  
post #70 of 75 Old 09-02-2014, 04:35 PM
AVS Special Member
 
ajhieb's Avatar
 
Join Date: Jul 2009
Posts: 1,602
Mentioned: 7 Post(s)
Tagged: 0 Thread(s)
Quoted: 436 Post(s)
Liked: 434
Quote:
Originally Posted by smitbret View Post
Fine, although you will find FlexRAID, SnapRAID and unRAID most commonly compared to RAID 5 & 6 in the real world, I concede that they are more closely related to RAID 3 & 4. Shame on me for perpetuating this commonly accepted myth. I will, henceforth, also stop referring to the 'American Bison' as 'Buffalo', I will also make sure that I use 'Analogy', 'Metaphor' and 'Satire' in their proper sense and also no longer refer to 'Pandas' as 'Bears'.

Together we may be able to change the world.

Fortunately, it still doesn't make a bit of difference what RAID level we call it. Irregardless..... whoops.... I mean Regardless, it's completely irrelevant to the original question. Snapshot vs. RealTime is the difference. As far as RAID levels go, OP could care less...... Oops, these real world inconsistencies are everywhere. I meant "couldn't care less." My bad.
"It really isn't relevant, but it's 'irregardless', not 'regardless'"
"It really isn't relevant, but it's 'could care less', not 'couldn't care less'"
"It really isn't relevant, but pandas are bears"
"It isn't really relevant, but 2+2=Cheezeburger"

You see how silly that is when you make the distinction that something is irrelevant and then immediately follow it up by making an incorrect assertion about it?
If you're going to go out of your way to assert that something doesn't matter, then go on to discuss the irrelevant details, you're being the very definition of a pedant. And that's fine too. Nothing wrong with paying attention to details, especially in a science forum. But don't be shocked when you open up that pedantic door, and someone else walks through.

RAID protection is only for failed drives. That's it. It's no replacement for a proper backup.
ajhieb is offline  
post #71 of 75 Old 09-02-2014, 04:43 PM
Senior Member
 
smitbret's Avatar
 
Join Date: Nov 2005
Location: East Idaho - Pocatello
Posts: 383
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 127 Post(s)
Liked: 47
Quote:
Originally Posted by ajhieb View Post
"It really isn't relevant, but it's 'irregardless', not 'regardless'"
"It really isn't relevant, but it's 'could care less', not 'couldn't care less'"
"It really isn't relevant, but pandas are bears"
"It isn't really relevant, but 2+2=Cheezeburger"

You see how silly that is when you make the distinction that something is irrelevant and then immediately follow it up by making an incorrect assertion about it?
If you're going to go out of your way to assert that something doesn't matter, then go on to discuss the irrelevant details, you're being the very definition of a pedant. And that's fine too. Nothing wrong with paying attention to details, especially in a science forum. But don't be shocked when you open up that pedantic door, and someone else walks through.
I hope that was an attempt at humor.

BTW, I was wrong about Pandas. Recent genetic tests say they are bears. I was working off old material that said that they were more closely akin to the Raccoon.

Last edited by smitbret; 09-02-2014 at 04:46 PM.
smitbret is offline  
post #72 of 75 Old 09-02-2014, 04:52 PM
Senior Member
 
Aryn Ravenlocke's Avatar
 
Join Date: Aug 2006
Location: Tempe, AZ, USA
Posts: 485
Mentioned: 2 Post(s)
Tagged: 0 Thread(s)
Quoted: 230 Post(s)
Liked: 78
Quote:
Originally Posted by smitbret View Post
I hope that was an attempt at humor.

BTW, I was wrong about Pandas. Recent genetic tests say they are bears. I was working off old material that said that they were more closely akin to the Raccoon.
Don't feel bad, the information is still fairly recent in the grand scheme of things, and it's not like the genetic studies were exactly widely published. Heck, you could be one of those elementary school teachers being corrected by her students that there are really only eight planets in the Solar System now.
Aryn Ravenlocke is online now  
post #73 of 75 Old 09-02-2014, 04:58 PM
Senior Member
 
smitbret's Avatar
 
Join Date: Nov 2005
Location: East Idaho - Pocatello
Posts: 383
Mentioned: 1 Post(s)
Tagged: 0 Thread(s)
Quoted: 127 Post(s)
Liked: 47
Quote:
Originally Posted by Aryn Ravenlocke View Post
Don't feel bad, the information is still fairly recent in the grand scheme of things, and it's not like the genetic studies were exactly widely published. Heck, you could be one of those elementary school teachers being corrected by her students that there are really only eight planets in the Solar System now.
Yep, it still upsets me that I was sent to the principal's office in 4th grade because I wouldn't stop correcting my teacher about Bison vs. Buffalo.

I'm 41
smitbret is offline  
post #74 of 75 Old 09-02-2014, 05:06 PM
AVS Special Member
 
ajhieb's Avatar
 
Join Date: Jul 2009
Posts: 1,602
Mentioned: 7 Post(s)
Tagged: 0 Thread(s)
Quoted: 436 Post(s)
Liked: 434
Quote:
Originally Posted by EricN View Post
I had a professor that enjoyed the aphorism, "Computer Science: the only field where ontogeny actually recapitulates phylogeny" ¹

His point was that in the software industry, the development of a new product over the span of a few years will replicate the steps the industry took as it progressed across decades. He'd pull case study after case study, showing how companies would rediscover ideas from the 1960s and implement them. For version 2, they move to concepts from the '70s, and so on, reinventing the same wheels and discarding them for the same reasons.

It's what creates this phenomenon. The industry hasn't matured yet, and there's still room for a code-first, design-last approach.

¹ Recapitulation theory
That's an interesting take on things. I had never really given it any thought, but I suppose that is true in many instances. Somewhat fitting of that, spectrumbx indicated on another thread that he was going to implement many of the standard RAID levels in a future version of FlexRAID. The circle of life I guess.

Though I would say the theory applies to a lot more than Computer Science. I think in most endeavors to build a better version of something that already exists, you can either start with the product you're trying to improve, and basically inherit its evolutionary steps, or you can start from scratch, which often takes you down the same evolutionary path.

RAID protection is only for failed drives. That's it. It's no replacement for a proper backup.
ajhieb is offline  
post #75 of 75 Old 09-03-2014, 01:50 AM - Thread Starter
V_J
Member
 
Join Date: Jun 2014
Posts: 110
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 31 Post(s)
Liked: 12
Quote:
Originally Posted by EricN View Post
It's what creates this phenomenon. The industry hasn't matured yet, and there's still room for a code-first, design-last approach.
But still... bandwidth-wise... http://what-if.xkcd.com/31/
(about 6 years ago, we transferred data by shipping 500 GB hard drives: you'd be surprised how difficult it was to transfer huge files - in our case raw satellite images that were over 25 GB each - over the internet without any corruption or checksum failures. I haven't tried it recently, so I don't know how much it has improved)
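
For anyone wanting to run that kind of check on large files, a minimal sketch is to hash the file in chunks on both ends and compare the digests. The choice of SHA-256, the chunk size and the function name are my assumptions here, not what we used back then.

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compute the digest on the sending side, send it along with the file, and recompute it on the receiving side; any mismatch means the copy was corrupted in transit.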

Quote:
Originally Posted by smitbret View Post
As far as RAID levels go, OP could care less...... Oops, these real world inconsistencies are everywhere. I meant "couldn't care less."
Not being a native English speaker, I won't comment on the sentence or which is the correct one.
But I do care to some extent what the raid level is. Pure duplication is more expensive per GB than a parity-based system; you need more gross storage to have the same amount of net storage. In addition you need more HDDs to pull off the same storage (assuming equal HDD size), so SATA ports and cabinet become an issue.
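
As a quick back-of-the-envelope check of that drive count, the sketch below assumes equal-sized drives and only two schemes: full duplication, and a single dedicated parity drive. The function and scheme names are made up for illustration.

```python
import math

def drives_needed(net_tb, drive_tb, scheme):
    """Number of equal-sized drives needed for a given net capacity."""
    data_drives = math.ceil(net_tb / drive_tb)
    if scheme == "duplication":
        # Every data drive needs a full mirror copy.
        return 2 * data_drives
    if scheme == "single_parity":
        # One extra drive protects all data drives against one failure.
        return data_drives + 1
    raise ValueError(f"unknown scheme: {scheme}")

if __name__ == "__main__":
    for net in (8, 12, 16):
        dup = drives_needed(net, 4, "duplication")
        par = drives_needed(net, 4, "single_parity")
        print(f"{net} TB net on 4 TB drives: duplication={dup}, parity={par}")
```

With 4 TB drives, 8 TB net needs 4 drives duplicated versus 3 with parity, and the gap widens as the library grows (12 TB net: 6 versus 4).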

I've come to the same conclusion that smitbret posted. If the collection is small enough, it is manageable. And I've come to realize that, as long as the HTPC is also the storage server, there are too many complications (os, usability, ...) to properly adjust things. Once the collection needs to be on its own server, then I can improve on it.
V_J is offline  