Quote:
Originally Posted by erik71 
I disagree that it is inevitable that people will have trouble with the size of the parity file. As long as the typical file size is several times the block_size, then the parity file will be about the same size as the data on the drive with the most data (a little larger: wasted space will be approximately equal to half the block_size times the number of files).

Presumably though (and I don't have my test laptop with me today, so I can't poke at it), storing a 261 kilobyte file with the 256 kilobyte block size wastes just as much slack space as storing a 5 kilobyte file?
You will completely fill one block within snapraid, but the next block will still be largely wasted.
This clearly isn't as bad as the worst-case scenario I set up with my 5k file tests, which wastes space in *every* block, but it does mean that the more files you have, the more wasted space you'll inevitably have, so long as they don't fall perfectly on a snapraid block boundary (unlikely?). This is backed up in the snapraid manual
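To put numbers on that, here is a rough sketch of the slack arithmetic. It assumes every file starts on a fresh parity block and is rounded up to whole blocks, which is how the behaviour has been described in this thread; the function names and the sample drive contents are made up for illustration and are not part of snapraid.

```python
KIB = 1024


def allocated_bytes(file_size: int, block_size: int) -> int:
    """Space a file occupies once rounded up to whole blocks (assumed model)."""
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def slack_bytes(file_size: int, block_size: int) -> int:
    """Wasted space in the last, partially filled block of a file."""
    return allocated_bytes(file_size, block_size) - file_size


def parity_estimate(drives: dict, block_size: int) -> int:
    """Rough parity size: the largest per-drive total of rounded-up file sizes."""
    return max(
        sum(allocated_bytes(size, block_size) for size in sizes)
        for sizes in drives.values()
    )


block = 256 * KIB
small = slack_bytes(5 * KIB, block)    # one block, mostly empty
large = slack_bytes(261 * KIB, block)  # one full block plus a mostly empty one
print(small // KIB, large // KIB)      # both 251 (KiB)

# Hypothetical drive contents, file sizes in bytes.
drives = {"d1": [5 * KIB, 261 * KIB], "d2": [700 * KIB]}
print(parity_estimate(drives, block) // KIB)  # 768 (KiB), from either drive
```

Under that model the 5 kilobyte and 261 kilobyte files waste the same 251 kilobytes each, so the overhead grows with the number of files rather than with their size.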
Quote:
The more drives and more storage you're trying to protect - the more files you will likely have.
So even with large files you will eventually hit a problem.
As I said, my real concern is that I can't just assume 1:1 in terms of largestdatadrivesize:snapraidparityspacerequired. In fact, I can guarantee that the parity space required will always be more than the largest drive of data you have. And that's before we discuss the space required for the content file.
Even if 1:1 data to parity space holds true at some point in your array's life, it may not hold true once you shove another 3 or 4 data drives in, fill them with data and update parity.
Quote:
My reasons for using snapraid are to try and remove the need for traditional raid and have protection with the minimum amount of overhead possible. So having to worry about additional protection for the content file (particularly in any sort of raid array) doesn't really suit me.
For reference I'm coming from unraid (where your parity drive is the same size as your largest data drive though unraid obviously works at the block level meaning you're ducking any of the issues snapraid is presenting) and looking at either flexraid or snapraid as potential replacements.
I've done minimal testing of flexraid, but my first impression was that it doesn't suffer from this problem despite also using file-based protection. I wouldn't like to say for sure, though.

I'd agree it's looking like snapraid may not be for me, though I would suggest I am perfectly in the target market for it: I have a mid-sized array, mostly of large media files but also with a scattering of smaller files (metadata files, general day-to-day documents, photos, ebooks / papers etc). But I'm not sure how scalable it will be if you're protecting a large amount of data / files.
The easy answer, as I've mentioned on this thread previously, is to just try it against my existing data, but that would require me to buy an additional drive. It's also making spec'ing a new storage system difficult, as I have no idea how much storage I'll need for parity.
Perhaps people on this thread running snapraid in anger could report back on their total file count and data size against their snapraid parity and content file sizes?
Thanks for the discussion though. I'm going to drop the snapraid block size to match the filesystem (which for ext4 is 4k by default, I think) and see what the results look like. The sums suggest I'll need 3 GB of RAM to handle that, which is fine.
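For anyone wanting to redo the RAM sums for their own array, this is a rough sketch of the kind of estimate involved. It assumes memory use scales with the number of blocks at a fixed per-block cost; the 17 bytes per block used here (a 16 byte hash plus one byte) is my assumption, so check the manual for your snapraid version before relying on it. The function name and the 1 TiB array are hypothetical.

```python
def estimate_ram_bytes(total_data_bytes: int, block_size: int,
                       bytes_per_block: int = 17) -> int:
    """Rough RAM estimate: number of blocks times an assumed per-block cost."""
    blocks = -(-total_data_bytes // block_size)  # integer ceiling division
    return blocks * bytes_per_block


TIB = 2 ** 40
for block_size in (4 * 1024, 256 * 1024):
    ram = estimate_ram_bytes(1 * TIB, block_size)
    print(f"block size {block_size // 1024}k: about {ram / 2**30:.2f} GiB of RAM")
```

The point to take from it is the trade-off: a 64 times smaller block size cuts the slack per file but multiplies the block count, and with it the memory needed, by 64.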

Can you zip the log file up with the relevant information for me please, and create an issue @ "




