AVS › AVS Forum › Video Components › Home Theater Computers › So now I get to find out if I can trust Flexraid...

So now I get to find out if I can trust Flexraid...

post #1 of 15
Thread Starter 

Here's an interesting one for you Flexraid gurus.

 

The parity update had failed this morning.

 

Two DRU drives have gone MISSING. At the same time. How likely is that? Either they really died or the Norco backplane is borked (the other two drives in the row work fine though). 

 

While waiting for replacements I thought I'd restore to two identical drives which are in the pool but 100% unused. So I figured I could remove the empty drives from the pool and then restore to them. Only the remove fails (same result regardless of which drive I try to remove).


 

Quote:

The [<[update]>] task has successfully initiated...

Process number: 2

An exception has occurred! See logs for full message.

Error message: DRU4 has failed or is offline! - Please bring it online or recover it...

 

Name: Metadata Scanner Process

Start Date: Thu Jan 30 08:30:58 GMT+100 2014

End Date: Thu Jan 30 08:30:59 GMT+100 2014

Duration: 00:00:00 Throughput: N/A

 

 

DRU4 is one of the failed/missing drives. The pool is all 3 TB drives, 8 in all, of which 2 are PPUs.

 

It looks like Flexraid requires that a parity update be performed before it will remove a drive, which obviously fails because of the two missing drives.

 

How can I now get the two empty drives out of the pool and restore to them? What will happen if I just yank the empty drives so they go "MISSING", re-format them and then restore to them? That ought to result in the two originally failed drives being restored and the two others (which had no data on them) being missing until I can get two new drives. Will that work or will the whole thing go south when Flexraid discovers FOUR missing drives?
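For what it's worth, the core constraint here is just parity arithmetic: with 2 PPUs the pool can reconstruct at most 2 simultaneously missing DRUs. A toy sketch (drive names other than DRU4 are made up; real snapshot-RAID recovery also requires the parity data itself to be current):

```python
# Toy model of snapshot-parity fault tolerance: a pool with N parity
# (PPU) drives can reconstruct at most N simultaneously missing data
# (DRU) drives, assuming the parity itself is intact and up to date.
def restorable(missing_drus, ppu_count):
    return len(missing_drus) <= ppu_count

# The scenario above: 2 PPUs and the 2 failed DRUs -> restorable.
print(restorable({"DRU4", "DRU5"}, 2))                    # True
# Yanking 2 more (empty) drives would leave 4 missing at once,
# which exceeds what 2 PPUs can cover in a single restore.
print(restorable({"DRU1", "DRU2", "DRU4", "DRU5"}, 2))    # False
```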

post #2 of 15
Thread Starter 

Update: Moved the two failed drives (both WD Green :confused:) to two other bays in the Norco. They came up in the Windows disk manager - but with no volumes on them. Windows said they had 1.27 TB free - when in the Flexraid pool they were 100% full...

 

The WD diagnostics tool didn't see them at all...

 

So I reset one of the disks, created a new volume and I am now restoring to it in Flexraid. Currently running at 72 MB/s so it will take a while before I find out if I can trust the restore...
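A quick back-of-the-envelope on the wait, assuming the 72 MB/s holds for a full 3 TB drive:

```python
# Rough restore-time estimate for one 3 TB DRU at the observed rate.
drive_bytes = 3e12      # 3 TB (decimal), the size of each data drive
rate_bytes_s = 72e6     # 72 MB/s sustained restore throughput
hours = drive_bytes / rate_bytes_s / 3600
print(f"{hours:.1f} hours")   # about 11.6 hours per drive
```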

 

So something caused the disks to become FUBARed while in the original bays (on motherboard ports) and go missing from Flexraid (and Windows). Then when put in two different bays (on IBM M1015 ports) they came up, but with the volumes gone, and I was able to reset and re-format them.

 

What does this suggest? Bad Norco backplane (which I don't like thinking about at all), bad disks, or software related?

post #3 of 15
I had the same problem with a WD GREEN, I swapped it out long ago and it was one of the first issues I had with those drives. I never had two at the same time though. That makes me think something else could be wrong, otherwise you got bad luck.

When my most recent Green failed I could not remove it either. I had the same issue you are having: Flexraid can't remove a failed drive. I ended up just starting over; I rebuilt my parity and upgraded my parity drive in the process.

It's working great now, all 38TB. As for the WD GREENs I removed: I ended up using only one 3TB GREEN in my pool and none of the 2TB models. I'll likely replace that 3TB soon.

I don't trust them at all; something is just wrong with those drives and how I use them. I thought that since it's not hardware RAID the GREENs might be a nice choice, and everyone here was recommending them at the time I bought them. Live and learn the hard way, I guess.

I never found a solution to your problem; I found it easier to just start over (the data on the failed drive wasn't important to me). Had it been critical data I probably would have dug deeper for a proper solution.

I'll be following along because I'm curious about this. You might want to start a thread at the flexraid forum or PM Brahim here with a link for some help.
post #4 of 15
If you swap from a mobo port to an IBM port the drive should still work fine and be included in your pool.

That part was odd too. Did you mess around with the drives at all before moving them to other bays? Flexraid unmounts the HDD and removes the drive letter when you add it to the pool.

When I deleted my Flexraid config entirely, all my drives showed back up (including the failing one)... I could read from the bad drive, but it always entered chkdsk on startup and had errors. It wasn't totally a brick, so I did save some data.
post #5 of 15
Thread Starter 

No, I didn't do anything with the drives. Flexraid reported them as missing. They were gone from the WSE2012 disk manager too. WD's diagnostic tool did not see them. Reboot didn't help. 

 

I then inserted them in two other bays (on the IBM ports). They showed up in the WSE2012 disk manager with 1.27 TB (out of the normal 2.73) free capacity but unformatted (NTFS volume gone), even though they had been 100% full in Flexraid. The WD tool would not read them.

 

So both drives died at the same time with identical symptoms. 

 

I don't know how the Norco backplanes are wired, maybe two of the four bays failed as a pair at the same time and caused garbage to be written to the drives. Or both drives had an identical, temporary glitch. Pretty weird.

 

The drive that I reset and reformatted is now 60% into the rebuild which looks to be going fine. So it appears the damage wasn't permanent. But can I trust it, even if the rebuild completes without errors?

post #6 of 15
I think you can trust Flexraid to restore however many failed drives your parity count covers. Restoring two drives doesn't seem like a big deal, especially if you restore both and lose nothing. That should be enough to keep trusting it, but your issue is strange and you might want to dig further into the cause. If your Norco backplane is to blame, you'll want to swap it out before it takes down an entire row of drives next time.

But again, I have no clue if that is really the case or not so.. IDK ??
post #7 of 15
Thread Starter 
Starting to look like a backplane problem. I relocated two Seagate 2TB drives from another machine into the two bays where the two missing drives had been. Formatted them first in Windows; one of them took much longer than normal, but the volumes were created OK.

Tried copying a 10GB movie to one of them. Ran at 150 MB/s until 80% was done, then froze completely and pretty much hung the entire server. Had to reboot.

After reboot, I used AMD RaidXpert to initialize both disks and tried to create a RAID 1 array. It reached about 0.5% and then failed with a "drive error", leaving the drives offline.

So I have two bays in the Norco which do strange things to hard drives. The bays are not dead, they seem to work for a while and then seize up.

Very much doubt it's the mobo SATA ports or the PSU since all the others work fine. Desperate for suggestions as to what this is!
post #8 of 15
I would think a loose SFF-8087 cable or a bad backplane.

Probably swap out the backplane?
post #9 of 15
Thread Starter 
I'm getting real worried now. The Samsung SSD that I posted about in Mfusick's server thread and which failed in my RAID-1 O/S mirror probably fell victim to a flaky backplane as well, the symptoms were similar.

I sent off an email to the company in the Netherlands from which I purchased the Norco chassis. They have not responded and from their website it seems they no longer carry Norco. Maybe because of too many QA problems...

So I may have a problem with getting warranty replacement of the backplanes.

This is a server I built for a client so I would hate for the backplanes to cr*p out.

Maybe I should have gone with Supermicro...
post #10 of 15
You run your SSDs off the SAS backplane?? And the RAID card?

I would think the mobo port makes more sense for SATA3 and SMART monitoring, and also speedier boot-up??

PLUS you can pass TRIM over the mobo port (not sure about the controller card), so that might be better too.
post #11 of 15
Thread Starter 
The SSDs run on motherboard ports, but yes they are in hotswap bays. That's kinda the point in having a hotswap case...
post #12 of 15
I put mine on the top shelf because I'll never need to hot swap my OS drive (although you have RAID).

When my SSD died (it actually did) I used a Phillips screwdriver. I then clean installed to a new SSD.

I can't say my way is better though.
post #13 of 15
Quote:
Originally Posted by politby View Post

Starting to look like a backplane problem. I relocated two Seagate 2TB drives from another machine into the two bays where the two missing drives had been. Formatted them first in Windows; one of them took much longer than normal, but the volumes were created OK.

Tried copying a 10GB movie to one of them. Ran at 150 MB/s until 80% was done, then froze completely and pretty much hung the entire server. Had to reboot.

After reboot, I used AMD RaidXpert to initialize both disks and tried to create a RAID 1 array. It reached about 0.5% and then failed with a "drive error", leaving the drives offline.

So I have two bays in the Norco which do strange things to hard drives. The bays are not dead, they seem to work for a while and then seize up.

Very much doubt it's the mobo SATA ports or the PSU since all the others work fine. Desperate for suggestions as to what this is!

I would relocate the reverse breakout cable to another row and retest to rule out the cable itself. The hardest thing about the breakout cable is figuring out which cable goes to which bay.
post #14 of 15
Thread Starter 
Guess I might have to do that. Royal pain in the rear...

But just an hour ago the second drive in the SSD raid-1 O/S array went offline again. Exactly the same symptom as last week.

So now I have one flaky bay in the top row and two in the second. Both rows are fed by breakout cables from the mobo's 8 ports.

The two lower rows connect via straight 8087 cables to an IBM M1015.

So if the problem is cable related I have TWO problematic ones.

TBH I am pretty unhappy with Norco's lack of QA. The expansion slots on the case were so badly done that I had to use a hammer and chisel to enlarge the notches the lower tabs of the expansion cards' brackets push into. Almost cracked the mobo.

Strongly considering ditching the Norco and picking up one of these:

http://www.ebay.de/itm/390749809858?ssPageName=STRK:MEWAX:IT&_trksid=p3984.m1423.l2648

Either run it as it is - wonder if it will perform as well as the AMD A8-6600K which is in the Norco now - or swap the components.

Could probably add a second Xeon for almost nothing.

This Supermicro has a 16-port SATA backplane so I'd have to get 8 standard SATA cables and two forward breakout ones.
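The cable count checks out if the 16 bays split as 8 on the motherboard and 8 on the M1015 (my assumption about the intended layout):

```python
# Port/cable tally for a 16-port SATA backplane (assumed split:
# 8 bays on motherboard SATA ports, 8 on the IBM M1015).
mobo_sata_cables = 8    # one standard SATA cable per mobo port
m1015_breakouts = 2     # each forward breakout fans out to 4 SATA plugs
total_bays = mobo_sata_cables + m1015_breakouts * 4
print(total_bays)       # 16, matching the backplane
```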

That SM server looks like a great deal.
post #15 of 15
Nice looking case. That seems cheap ^