
DIY terabyte media servers - an open discussion

6886 Views 100 Replies 40 Participants Last post by  mattdisaster
Hi guys,


There has been quite a lot of talk recently concerning large media servers, and there seems to be a fair amount of disagreement as to what is the best way to handle them in terms of reliability, flexibility, and cost effectiveness, so I thought it would be a good idea to start a thread specifically for the do-it-yourselfers among us (which I believe represents a significant percentage of the membership here) to openly discuss different ideas as to how to accomplish our goals. I've seen people hijacking some of the other threads and figured that I would open a discussion where no specific "special interest" group will be in a position to be attacked and/or criticized. In this thread, as long as we adhere to forum rules, please feel free to express your opinions concerning RAID controllers and types, as well as SCSI vs. EIDE issues, case types, costs, etc.


Let me start off by telling you about the HTPC that I recently built. I now have a P-IV based machine with 512 megs of RAM, running Win XP Pro, with six 80 gig EIDE drives installed, so the machine is doing double duty as an HTPC and a media server. On my MB's built-in primary controller I have two 80 gig Maxtor Quiet drives installed, and on the secondary controller I have a Toshiba SD-M1612 drive installed as my main DVD drive, and a SD-R1202 combo drive as a backup player and CD reader/writer. Then I installed a Promise ATA/100 controller and attached four Maxtor 80 gig 5400 RPM drives to it for media storage. This gives me a total of 480 gigs of space in a single machine, without using RAID. I have a similar machine elsewhere in my house with six more 80 gig drives, also being used as a video encoding machine and a media server, with all machines networked (of course :) ).


I had considered using a RAID controller, but I really don't understand the need or usefulness of such a setup. First of all, without using RAID, I can pull any drive out of any machine, install it in another machine, and the data remains intact. If I use something like RAID 5, wouldn't I lose that flexibility, as the drives would all work together to act as one large drive? Also, if one of the drives bit the dust, wouldn't I lose ALL of my data, instead of just one drive's worth? And if I used a different flavor of RAID in order to gain redundancy, wouldn't I lose half of the storage space in order to be protected against drive failure?


So to open the discussion, let's have some opinions and ideas as to when RAID should be used, and why it is necessary or even desirable in home media servers. If RAID should be used, what are the best controllers, which are the most cost effective ones, and where is the point of overkill? What type of RAID is most appropriate, and what kind of performance is needed in home media serving? And since I am not selling anything or representing any company, ALL opinions are welcome :)
Status
Not open for further replies.
1 - 20 of 101 Posts
I read a while back that MSI came up with a non-legacy motherboard design which supported 12 IDE devices.


12*120gb drives = 1440gb. That's pretty nice.


Of course, you can always buy one of those 4xIDE RAID cards. They're a bit pricey, but you get RAID reliability (mirroring, hot swapping) and each card can take on 120gb.


Get a mobo with built-in RAID, add 2 RAID cards, and you've got 12 RAID-supported drives. It won't set you back more than $4000, probably less (hey, if you go to a store and buy 12 drives, you can surely ask for some discount).


The benefit of RAID is actually speed: if you treat 4 drives as one big space, you get much faster writes. When dealing with huge files, that can be a benefit.


I think the options will be better once Serial-ATA kicks in, but even now, there are quite a few nice setups you can build.
It's a common misconception that RAID = speed but, as a professional database geek, I can tell you that's simply not true. No RAID level guarantees a higher read speed and almost all of them will slow writes down, sometimes very significantly.


RAID level 0 is disk striping, without any sort of redundancy. Of course, this means it isn't really a RAID level at all, since the R stands for redundant. This is the only level that's virtually guaranteed to allow both reads and writes at the same or higher performance than a single disk. For us HTPC types, it may not give any performance improvements, since we tend to read and write long, single streams of continuous data (VOB, MP3, ATSC, etc.). RAID 0 shines with random reads and writes or multiple streams.


RAID 1 is disk mirroring, which cuts your storage in half, improves your reliability by nearly 100%, nearly doubles your read capacity, and possibly cuts your write capacity almost in half since every write has to happen twice.


RAID 2, 3, and 4 we'll ignore, since they're essentially never used in general purpose storage.


RAID 5 is what most people talk about when they talk about RAID at all. Level 5 adds a bunch of disks together into one big virtual disk and adds parity checking to all writes (and possibly all reads, when configured). The read speed of RAID 5 is quite a bit higher than most other RAID levels, since even streaming reads can use all available disks almost equally. Some measure of read performance is lost if read parity is enabled, so it rarely is. RAID 5 writes, on the other hand, take quite a large performance hit due to the overhead of calculating parity and writing data in stripes across all disks simultaneously. One place where RAID 5 is ideal for HTPC geeks is that the write penalty is very significantly lower for large streaming writes. Random writes on RAID 5 can incur an enormous penalty and are frequently slower than single disk writes. Some, but not all, of this penalty can be mitigated by caching.


I personally would recommend RAID 0 for most HTPC types, on the assumption that most data that would be lost should be pretty easily recoverable. After all, you DO own all those CDs and DVDs you're ripping, don't you? ;)
An addendum to cord's very good post...


Re: RAID5 and storage capacity. Generally RAID 5 is configured with N disks (N > 2) of equal capacity, and one disk's worth of space is used for parity data. So the total usable size of a RAID5 array is


TotalSize = DriveCapacity X (N - 1) where N = # of disks in array.
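
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the formula above. The function name and the 160 GB example drives are just illustrative assumptions; it assumes equal-sized disks, as described.

```python
def raid5_usable_capacity(drive_capacity_gb: float, n_disks: int) -> float:
    """Usable space of a RAID 5 array built from equal-sized disks."""
    if n_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    # One disk's worth of space holds parity (spread across all members).
    return drive_capacity_gb * (n_disks - 1)


if __name__ == "__main__":
    # Hypothetical example: arrays of 160 GB drives.
    for n in (3, 4, 8):
        usable = raid5_usable_capacity(160, n)
        print(f"{n} x 160 GB -> {usable:.0f} GB usable ({(n - 1) / n:.0%} of raw)")
```

The fraction lost to parity shrinks as the array grows, which is why larger arrays look more attractive on a cost-per-gig basis.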


I realize many ppl know this, but just to make the discussion complete :p


On another note, I've heard/read very good things about the 3ware Escalade series of RAID controllers, but I haven't used them personally, and do not know what they cost.


And, personally, my OS of choice for a fileserver is a stable flavor of Linux, running Samba. Linux supports many/most ATA RAID cards (and straight add-on IDE cards) and Samba is relatively easy to set up (okay, I've done it many times, so it's easy for me at least). Additionally, because of the way that discs are mounted in Linux (on a directory level, rather than a drive letter level), it's very easy to add additional drives without having to create new shares...just mount the new drive to a new directory in the same share. I use this method to add new disks to my "movies" share....need more space, just add a directory (in Linux) named "disc2" and mount it. No new share names needed....


Cheers,


Rich
Cord - When you are dealing with THAT much data, though, would you really want to re-create it all in the event that it did blow up? Think about how much time it takes to rip all your stuff (even assuming that you own everything that you store) and you will come to the conclusion that you need either RAID 0 (with a backup solution), RAID 1, or RAID 5.


With RAID 0 your risk is that if ANY drive fails, you lose everything (I believe). Well, as you add drives, you greatly increase the probability that your storage system will fail... I don't know if I like that solution.
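
To put rough numbers on that, here is a small Python sketch. It assumes drives fail independently and that each has the same chance of failing over some period; the 3% figure is a made-up illustration, not a measured failure rate.

```python
def stripe_failure_probability(p_drive: float, n_drives: int) -> float:
    """Chance that a striped (RAID 0) array loses data.

    The array is lost if ANY member fails, so it survives only when
    every drive survives. Assumes independent failures (an assumption).
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be between 0 and 1")
    if n_drives < 1:
        raise ValueError("need at least one drive")
    return 1.0 - (1.0 - p_drive) ** n_drives


if __name__ == "__main__":
    p = 0.03  # hypothetical per-drive failure probability
    for n in (1, 2, 4, 8, 12):
        print(f"{n:2d} drives: {stripe_failure_probability(p, n):.1%} chance of losing the array")
```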


With RAID 1, your expense goes way up b/c you are only able to use 1/2 of your storage capacity.


With RAID 5, the only downside seems to be write speed. How much of a hit do you really take? Is it fast enough for recording live HD video feeds? For a DVD/MP3 storage solution, this seems to be OK as you typically are spending less time Writing data, and more time reading it (HOPEFULLY! Otherwise - what's the point!) With this solution, you only lose 1 drive to parity, so in a 3 drive RAID you have 2/3 of your capacity for storage, with 4 drives you are up to 3/4 etc. Much less of a hit.


My vote would be for a RAID 5 system.


Bob - your point as to why you should use RAID at all? Well, otherwise you have to deal with multiple partitions for data, instead of one large partition where you store EVERYTHING.


How about other aspects of a storage system? Is 100 Mbit fast enough? Do you really want a full PC just to store data, or would you rather have a standalone enclosure that is a dedicated storage system?


Jeff
I did a lot of research in this area and found the best solution to be:


Escalade 7810 IDE RAID card - will support 8 drives, looks like a SCSI controller to your system (uses only 1 IRQ!) and supports RAID 5. Can be purchased for less than $400 and works in Win98/ME/NT4.0/2000/XP.


8xMaxtor 160GB drives. In RAID 5, 7 are for data and one for parity. Yields 1.12 Terabyte with redundancy. Trust me, you don't want one drive going down to wipe out all this data and backing up isn't really practical. Cost is less than $1600.


That puts you out $2K for drives and controllers. It should hold about 150-240 DVDs, depending on the movies, or 130+ hours of HDTV. It will hold a bazillion CDs or something like that.


The RAID 5 issue is easy, as most of the time we are reading, not writing. Still, for a few hundred more, a coprocessor can be put on the card to increase the speed of the RAID 5 parity calculations. Even so, I don't think that we would tax the system, as ripping DVDs is really slowed down by the decrypting, and even saving an HD feed is not that much data.


I currently have 4x160 on a motherboard RAID system. At a later date, I will be switching to an escalade card and a full stack of drives, once I get all the rest of my projects finished - or closer to completion.
Actually the card that has raid 5 is the Escalade 7850. I don't want anyone getting the wrong card.
Ok, so let me see if I have this right. RAID level 5, in dedicating one drive to parity, can have any one drive completely fail without losing ANY data whatsoever? You just replace the drive in the array and everything is back to normal? How is this feat accomplished? It seems to me that the only way to have anything even close to reliability would be to have 100% mirroring, but I certainly am not qualified to know for sure. This factor alone would make RAID 5 very desirable, since there isn't any other good way to back up all of the hard work that went into creating this media server in the first place.

Quote:
Bob - your point as to why you should use RAID at all? Well, otherwise you have to deal with multiple partitions for data, instead of one large partition where you store EVERYTHING.
Multiple partitions do not seem to present many hurdles, or at least not at this point. Since most DVD's can be ripped and then stripped down to ~5 to 6 gigs each (or less), there really isn't a lot of wasted space on any drive if you plan out your space judiciously. Also, software like DVD Starter can be set up to launch movies from any partition and/or directory, and assembles the titles in one big list, making it appear as if you have all your movies on one big drive. The only real limitation is that you can't cross from one drive to another in the middle of a movie, but I don't think there would be much of an issue here. As far as CD's go, they are so small, relatively speaking, that multiple partitions would never be an issue at all. The only problem I can see here would be the recording of HD material, which comes in at ~8 gigs per hour, so if you tried recording 24 hours in a row of, let's say, the Olympics coverage, then you could feasibly run out of room on a single drive before finishing your recording. Oh yeah, and if your network is so huge that you are running out of drive letters, that could be a real problem too :)
You aren't dedicating one drive to parity, but (assuming you have 3 drives) all three drives.


D = Data (0 or 1)
P = Parity (0 or 1)
D1-D3 = Drives 1-3


D1 D2 D3
--------
D  D  P
D  P  D
P  D  D


The P will always equal whatever it needs to be to make the binary sum (XOR) of the entire row come out to 0.


So if 1 drive dies, the controller is able to rebuild that 1 drive by doing the math on the bits that remain on the good drives.


Sorry for the poor explanation, but that's how it works in very broad strokes.
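
For the curious, the rebuild trick can be sketched in a few lines of Python. This is only a toy illustration, assuming the usual XOR parity where the parity block is the XOR of the data blocks in a row; the block contents and function name are made up, and a real controller works on whole stripes, not 8-byte strings.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must all be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of a 3-drive array: two data blocks plus their parity block.
d1 = b"MOVIE-01"
d2 = b"MOVIE-02"
parity = xor_blocks(d1, d2)

# Pretend the drive holding d2 died: XOR the survivors to get it back.
rebuilt_d2 = xor_blocks(d1, parity)
print(rebuilt_d2 == d2)
```

The same XOR works no matter which of the three blocks is the missing one, which is why any single drive can fail without data loss.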
RAID 5 does in fact allow one drive to fail and the system is fine. You then replace the bad drive, and the system rebuilds the corrupted drive on the new drive.


Some (all?) controllers will allow you to put a "spare" drive in the mix that is not used unless a drive fails. Then it becomes part of the RAID automatically. This allows up to 2 drives to fail, as long as the second failure comes after the rebuild onto the spare has finished, and the system will continue to function.


What's nice is when you have some external indication as to the state of each drive in the system (green = good, red = bad). Even better is when you have hot-swap drives - you don't have to turn the system off when replacing a drive. These typically are SCSI SCA type drives (standard SCSI drives with an SCA connector).


As far as multiple partitions go, it's just a matter of preference. Personally, I can't stand having multiple partitions. I like to organize my files as sub-directories. Linux allows you to overcome the partition problems by allowing you to mount partitions as sub-directories, but you still can't have 2 drives look like a single directory, so you always end up with structures like /video/drv1 and /video/drv2, better, but not great.


Jeff
Bob,


thank you for starting this thread :) . To see how Raid 5 works have a look at the demo I posted in another thread ( http://www.microsoft.com/windows2000/demos/mod17.asp ). Although this solution is a total software based solution, it shows how 1 drive can fail and be replaced while all the data remains available.


The reason I believe that terabyte servers are becoming interesting is the fact that storage is 'only' 60% more expensive than CDs (excluding the CD writer). This still sounds like a lot, but hard disk 'backups' go quicker, there are no CD write failures, and compared to CDRW, hard disks are already cheaper.


Nevertheless I see 3 potential areas for improvement;


1) Use soft-raid 5 (The 7810 is 700 Euro)


2) Hard disk prices are continuously dropping, so I only want to add drives (space) when needed


3) A protection of the data on the hard disks is needed (e.g. against viruses or hackers). It may be true that for DVD's, the hard disk is already a backup, but what about recorded TV broadcasts or personal DV video's.


I don't know about HDTV, but for a maximum of 2 parallel video streams of DVD quality via a 100 Mbit network, I assume that soft RAID would be sufficient. We're not looking to serve a hotel ;) , we just want to have large data storage at a competitive price with the best possible protection (and a good user interface for access to it).


Wykat
A few more points on raid...


Do not confuse the features available on high end equipment with all "RAID" cards. Most of the cheap cards are software RAID with very little hardware help; think S3 Virge 3D acceleration.


Hotswap and auto fail-over are pretty high end features.


Rebuilding a raid 5 array takes huge amounts of dedicated time.


IDE drives are not designed to take 24/7/365 abuse.


Oh yeah, and don't forget RAID5+1 (mirrored RAID 5) and JBOD (just a bunch of disks). I believe a JBOD formatted NTFS will allow you to add a disk to an existing system and stretch the partition to fit without a rebuild.
Just FYI, any 3ware series 6000 or series 7000 escalade raid controller will do raid 5. I have a 6410 (4-drive) card in my current workstation. It has a setting to do raid-5 in its firmware. The only benefit that the 7850 (and the 4-drive 7450) cards buy you is raid-5 acceleration (data buffering and XOR in hardware, both together can make a 100-500% difference in raid-5 performance). So, if you only want a 4-drive raid-5 setup, and you don't care *too* much about write speed, the 6410 is very attractive, I think I picked mine up for about $120 a few weeks ago. Think of it as a poor-man's terabyte server...


(FWIW, I am using my 6410 in pure raid-0 mode with 4 Western Digital 120JB - 8MB cache, 7200rpm - drives and I am easily able to saturate my PCI bus - using the Intel iometer benchmark I was able to measure over 110MB/s streaming sequential read/write speeds.)
Wow, lots of good information here! The explanations and links provided allow even a non-techie like myself to understand how and why RAID works :)


It's looking like RAID 5 might be the best "all around" solution for the HTPC'er, as the total sum of space is reduced only by one drive within the array. Now, another question: Can various controllers be combined to form one large RAID? That is, suppose I were to buy a mobo with built-in RAID, and then add two controllers that could handle 4 drives each (like Blight proposed), giving you a 12-drive RAID. Let's assume that each drive is 160 gb, giving you a total storage capacity of 12*160=1920, minus 160 = 1760. Could the 3 controllers work together to form one large, cohesive RAID, such as this, or would each act independently so that you would have 4*160=640, minus 160=480, and then multiply this times 3 controllers = 1440? As you can see, there is a significant difference in the total storage capacity of the two proposed systems.


Another question... Do all drives have to be the same capacity within a RAID, or can different capacity drives be used together, and what effect will this have on the RAID?

Quote:
I personally would recommend RAID 0 for most HTPC types, on the assumption that most data that would be lost should be pretty easily recoverable. After all, you DO own all those CDs and DVDs you're ripping, don't you?
I disagree that RAID 0 would be best, or at least for me. Countless hours go into the work of ripping DVD's, and then stripping them down to just the movie itself. Some of us here even go to the extent of encoding them using DivX in order to reduce the size even further. I also use Monkey's Audio lossless compression for all of my CD rips. Yes, we all own our CD's and DVD's, but have put enormous amounts of time into preparing them for use on a media server, and we would hate to have to do ALL of that work all over again :)
1) Some RAID controllers can work together to create 1 huge array, but it is something that the card specifically has to support, and is usually only found on higher end cards.


2) All the drives don't have to be the same size, but the array will be limited by the smallest drive.


Let's say you have 4 10GB drives and 1 20GB drive... Total storage will be 40GB. The 20GB drive will be used as a 10GB drive, so you lose 10 there, and then 10GB would go to the parity of the array.
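
Both answers can be checked with a quick Python sketch. It assumes every member of a RAID 5 array is treated as being the size of the smallest drive, as described above; the function name is made up, and the 160 GB drive counts are taken from the question being answered.

```python
def raid5_usable(sizes_gb: list[float]) -> float:
    """Usable RAID 5 space when drives may differ in size.

    Assumption: each drive only contributes as much as the smallest
    member, and one member's worth of space goes to parity.
    """
    if len(sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return min(sizes_gb) * (len(sizes_gb) - 1)


# Mixed sizes: four 10 GB drives plus one 20 GB drive.
mixed = raid5_usable([10, 10, 10, 10, 20])

# One 12-drive array versus three independent 4-drive arrays (160 GB drives).
one_big_array = raid5_usable([160] * 12)
three_small_arrays = sum(raid5_usable([160] * 4) for _ in range(3))

print(mixed, one_big_array, three_small_arrays)
```

Three separate arrays pay the parity cost three times, which is where the difference between the two totals comes from.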
One thing you have to think about is what you are trying to achieve. Do you need more space in one filesystem than what you can get with one drive, do you need higher I/O rates, do you need redundancy, or a combination of the three? At my work, redundancy is required and we use either striping+mirroring or RAID5, depending on the intended usage. Sometimes we use hardware and sometimes we use plain old software RAID in Win2k or Unix.


By going to an array setup, you also lose your current ability to pull a single drive and move it somewhere else.


Dave
Using non-same-sized drives would be counter productive as ShockValue states. You must commit to a disk size in advance and stick to it for the array. I have done this with the 160G Maxtor for the first. I am hoping that I can do better (320G?) with the second. I am further hoping that blue light lasers or some other tech will come along in the next year to help me.


I posted the following in the thread http://www.avsforum.com/avs-vb/showt...7&pagenumber=3 and am posting it here as I believe Bob has a very good point - this should be an open discussion and not targeted to someone who is trying to help those who want a solution without possible problems:


That is a good demo (at http://www.microsoft.com/windows2000/demos/mod17.asp ) of software RAID and repair. Please notice how long it takes to copy their 1G folder from the Z: drive to the striped volume (S:). The copy box reports 7 minutes remaining after running for ~1 min. This means the copy will take ~8min. I am assuming these disks are IDE (SCSI, if UW, should be faster than this). I am also assuming that, since there are 8 physical drives, these drives are apportioned among 4 IDE channels. Looking at how the disk manager works, I would also assume that the copy is taking place from one channel to another. There is really no way to discern if master or slave.


The point of all of this is to show that 1G will take ~8min. By extension, 8G will take 64min - a little over what is required for streaming HD (8G per hour). Whatever they are using for their extra IDE channels is not sufficient to stream HD to more than one other computer (this copy is exercising 2 [logical] drives). Therefore, software RAID (as implemented by Win2K) is not fast enough to stream HD to more than one other computer.
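
The arithmetic above can be laid out as a short Python sketch. It only reproduces the estimate from the numbers quoted (1G in ~8 min for the copy, 8G per hour for HD); treating 1G as 1000 MB is an assumption, and since the copy was reading and writing at once, the result is a rough lower bound rather than a benchmark.

```python
MB_PER_GB = 1000  # assumption: decimal gigabytes


def rate_mb_per_s(gigabytes: float, minutes: float) -> float:
    """Average transfer rate in MB/s for a given amount of data and time."""
    return gigabytes * MB_PER_GB / (minutes * 60)


copy_rate = rate_mb_per_s(1, 8)   # observed copy: 1G in ~8 min
hd_rate = rate_mb_per_s(8, 60)    # one HD stream: 8G per hour
minutes_for_8g = 8 * 8            # 8G at 8 min per G

print(f"copy rate:     {copy_rate:.2f} MB/s")
print(f"one HD stream: {hd_rate:.2f} MB/s")
print(f"8G would take {minutes_for_8g} min, i.e. {copy_rate / hd_rate:.2f} HD streams")
```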


Am I wrong somewhere?


I also promised some tests of streaming HD or DVD video to more than one computer and will post my results in this thread.
This is an excellent discussion on archiving large amounts of data. I personally would not want to use 6 or 8 single drives with 6 to 8 drive letters. Just trying to find something would be too much work. Of course, using NTFS you can SPAN the drives so that you can accumulate a bunch of drives, just like JBOD (Just a Bunch Of Disks) and have all 8 drives as a single volume.


However, backing up such a beast could be very costly. Instead, a RAID 5 configuration would give you redundancy at the cost of one drive for parity and much poorer write speeds due to XOR calculations. Write speeds can be the same as or lower than using a single drive.


There's a number of Raid 5 IDE and SCSI cards, Adaptec 2400A and 3ware's Escalade's 6400, 7410, and 7450.


Check out storagereview.com's website for reviews on ALL of these RAID controllers. Here's a link for just one of the reviews (it's a long single page version with lots of graphs)...

Storagereview.com


I have not used the escalade RAID controllers, but they seem to be highly recommended for raid 5. Since this is for a media server sequential read/write would be more important than random i/o. For best performance your boot drive should be separate from the storage array.


The cons to this setup would be the noise and heat generated by all the drives, especially if your HTPC is in the same room. If it's going to be located in another room, then it's no big deal. Otherwise a separate storage network would be better, with a beefed up power supply (aka loud) to handle the load of 8+ drives powering up at once. I don't know if the Escalade or other controllers can delay the startup of drives like SCSI can.
I forgot to add that other EIDE/ATA RAID 5 controllers are 3ware's Escalade 7810, 7850 and the Promise SuperTrak Series. If anyone knows of any others, please let us know...
Drives are getting big enough that drive swapping will be the new thing.