  #1 (permalink)  
Old April 3, 2016, 02:52 PM
Hall Of Fame
F@H
 
Join Date: Nov 2008
Location: Ottawa, ON
Posts: 1,303

Question Assessing the damage to my RAID-is-not-backup fail

Hello all,

I have a RAID 1 array that has lost 2 of its 3 disks (one dropped by me, one dropped automatically). Can I still assume the data on the one remaining disk is OK, or should I treat it as garbage too and fall back to my backup of the array?

------------------------

Story:

I used to have 3x WD Caviar Green 1.5 TB drives in an array. Yesterday morning, I dropped one of the drives from the array because its Offline Uncorrectable Sectors count had risen from 7 (back when I made this post) to 22.
After dropping and removing that drive (leaving 2 drives in the array), I started copying data off the array to other disks / the cloud. (Hint: lots of reads)

During the copy operations, I got this:

Code:
It could be related to component device /dev/sdd1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md0 : active raid1 sdc[3] sdd1[2](F)
      1463735296 blocks super 1.2 [3/1] [U__]
      bitmap: 11/11 pages [44KB], 65536KB chunk
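
For anyone reading the mdstat output above: in the `[3/1] [U__]` part, the first number is how many devices the array is configured for, the second is how many are currently active, and each `U` or `_` marks a slot as up or missing. Here is a minimal Python sketch of pulling that apart. The function name `parse_md_status` is made up for illustration and is not part of mdadm.

```python
import re

def parse_md_status(line):
    """Parse the '[n/m] [U_...]' part of a /proc/mdstat status line."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
    if match is None:
        raise ValueError("no [n/m] [U_] status found in line")
    expected = int(match.group(1))  # devices the array is configured for
    active = int(match.group(2))    # devices currently active
    slots = match.group(3)          # 'U' = up, '_' = missing or failed
    return {
        "expected": expected,
        "active": active,
        "slots": slots,
        "degraded": active < expected,
    }

# Status line copied from the mdstat output in this post.
status = parse_md_status("      1463735296 blocks super 1.2 [3/1] [U__]")
print(status)
```

With the line from this post, the sketch reports 3 expected devices, 1 active, and a degraded array, which matches "down to one drive from three".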
Looking at the logs, I also got hundreds of these:

Code:
Apr  3 13:55:42 terbium kernel: [58792.180859] sd 7:0:0:0: [sdd] Unhandled error code
Apr  3 13:55:42 terbium kernel: [58792.180859] sd 7:0:0:0: [sdd]
Apr  3 13:55:42 terbium kernel: [58792.180860] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
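
To get a feel for how many of these errors each disk produced, something like the Python sketch below could tally them. It is only a rough sketch: `count_bad_target_errors` is a hypothetical helper, and it assumes the `Result:` line belongs to the most recent `[sdX]` line before it, as in the excerpt above. Interleaved messages from several disks would need more careful matching.

```python
import re
from collections import Counter

def count_bad_target_errors(log_lines):
    """Count 'hostbyte=DID_BAD_TARGET' results per device name."""
    counts = Counter()
    device = None
    for line in log_lines:
        # Remember the last device mentioned, e.g. "[sdd]".
        dev_match = re.search(r"\[(sd[a-z]+)\]", line)
        if dev_match:
            device = dev_match.group(1)
        # Attribute the result line to that device.
        if "hostbyte=DID_BAD_TARGET" in line and device is not None:
            counts[device] += 1
    return counts

# The three log lines quoted in this post.
sample_lines = [
    "Apr  3 13:55:42 terbium kernel: [58792.180859] sd 7:0:0:0: [sdd] Unhandled error code",
    "Apr  3 13:55:42 terbium kernel: [58792.180859] sd 7:0:0:0: [sdd]",
    "Apr  3 13:55:42 terbium kernel: [58792.180860] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK",
]
counts = count_bad_target_errors(sample_lines)
print(counts)
```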
------------------------

Prime example of RAID is not backup

Well, it's safe to say I'm down to one drive out of three AT BEST. Do I assume the one remaining disk is still fine (for now) and copy its contents somewhere so I can keep using them as-is, or do I assume the data on the former array is garbage and do an annoying restore from the month-old backup of the array?
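
One way to get some evidence either way is to checksum the files on the surviving disk against the same files in the backup. The Python sketch below is a minimal illustration under assumptions: both trees are mounted and share the same relative layout, and `compare_trees` and `sha256_of` are hypothetical names. A mismatch does not say which copy is bad, and files edited since the month-old backup will differ legitimately.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(array_root, backup_root):
    """Compare every file under array_root with its twin under backup_root."""
    array_root, backup_root = Path(array_root), Path(backup_root)
    report = {"same": [], "different": [], "missing_in_backup": [], "unreadable": []}
    for path in sorted(array_root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(array_root)
        twin = backup_root / relative
        if not twin.is_file():
            report["missing_in_backup"].append(str(relative))
            continue
        try:
            identical = sha256_of(path) == sha256_of(twin)
        except OSError:
            # A read error on a failing disk is itself useful information.
            report["unreadable"].append(str(relative))
            continue
        report["same" if identical else "different"].append(str(relative))
    return report

# Hypothetical mount points, adjust to the real ones:
# report = compare_trees("/mnt/md0", "/mnt/backup")
```

Files that land in "different" but were never touched since the backup are the ones worth a closer look.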
__________________
"The computer programmer says they should drive the car around the block and see if the tire fixes itself." [src]
  #2 (permalink)  
Old April 4, 2016, 08:26 AM
"Quote This..."
F@H
 
Join Date: Nov 2007
Location: Hell
Posts: 3,922
Default

You mean raid 0? Raid 1 is mirrored and not possible with 3 disks.
  #3 (permalink)  
Old April 4, 2016, 10:54 AM
Coach's Avatar
Allstar
 
Join Date: Mar 2007
Location: Morden MB
Posts: 774

Default

Why not RAID 1 with three disks? Won't the same data be written to all three?
  #4 (permalink)  
Old April 4, 2016, 02:35 PM
Hall Of Fame
F@H
 
Join Date: Nov 2008
Location: Ottawa, ON
Posts: 1,303

Default

It was RAID 1 with 3 disks:
https://raid.wiki.kernel.org/index.p...D_setup#RAID-1

Past the second disk, any additional disk can be set as a spare or as part of the array to be another mirror.
  #5 (permalink)  
Old April 4, 2016, 04:38 PM
AkG's Avatar
AkG AkG is offline
Hardware Canucks Reviewer
 
Join Date: Oct 2007
Posts: 4,674
Default

Personally... I say it depends. Most likely the data on your one remaining drive is fine, as it was 1/1/1 with all data cloned to each drive. NOW, this assumes your controller is A) not stupid and B) not running on an OS that was designed by idiots. If either assumption is wrong, your data is toast. Personally, I'd have shut the NAS down as soon as the second drive failed and not turned it back on until a replacement drive was in place. That way, when the NAS panics (and it will panic), it sees the empty 'good' drive and doesn't have a meltdown. (Yes, this is anthropomorphising what can happen, but you get the gist.) Plus, I would not trust the data to be readable on any NAS except that one, since each NAS OS does things slightly differently when formatting and such, and it could toast an otherwise good drive filled with good data.

IMHO it's an odd setup, as your writes must have sucked (but your reads were probably damn good).

YMMV
__________________
"If you ever start taking things too seriously, just remember that we are talking monkeys on an organic spaceship flying through the universe." -JR

“if your opponent has a conscience, then follow Gandhi. But if you enemy has no conscience, like Hitler, then follow Bonhoeffer.” - Dr. MLK jr
  #6 (permalink)  
Old April 4, 2016, 05:07 PM
Hall Of Fame
F@H
 
Join Date: Nov 2008
Location: Ottawa, ON
Posts: 1,303

Default

Thanks for the notes!
I managed to copy the remaining important stuff without any more complaints from the system, added another drive, and it's rebuilding now. I'll post again if it craps out to finish the story.

The home server NAS was a game to me. I had some hardware lying around and wanted to play around to see what configurations I could get out of it. However, I made the mistake of keeping the experiment around and then putting data on it that I'd be annoyed to lose.

I want to replace this thing with the right stuff, e.g. ECC RAM, NAS or enterprise disks, etc., but I don't feel like getting more Intel hardware to do it! I already have a Celeron, i3, i5, and mobile i7 sitting around here.

So far I've had no luck finding cheap + low-power ARM or AMD hardware to play with. (Yes, I could buy used Opterons, but that's overkill.)
  #7 (permalink)  
Old April 5, 2016, 11:42 AM
MARSTG's Avatar
Hall Of Fame
F@H
 
Join Date: Apr 2011
Location: Montreal
Posts: 2,629

Default

I would say the WD Greens are slowly dying their usual horrible deaths, but the data should be fine. From the logs, I assume the OS is not Windows?
  #8 (permalink)  
Old April 5, 2016, 02:24 PM
Hall Of Fame
F@H
 
Join Date: Nov 2008
Location: Ottawa, ON
Posts: 1,303

Default

Hehe, I like the adjectives / adverbs. The OS is Debian Jessie.
The array rebuilt without errors BTW, although I obviously am not going to trust anything to it anymore.

Oddly, the drive that had the errors isn't showing any reallocated sectors, and it survived a security wipe without bad sectors while attached to another computer. It did log an UltraDMA CRC Error Count of 614, though, which may mean a borked controller or cable. (I rebuilt the RAID 1 on different ports attached to a different controller on the motherboard.)
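
The reasoning above (CRC errors point at the link, sector counters point at the platters) can be written down as a rough rule of thumb. The Python sketch below is only that, a heuristic under assumptions and not a diagnosis: `triage_smart` is a made-up helper, and the keys are the attribute names as smartctl prints them.

```python
def triage_smart(attrs):
    """Return rough hints from raw SMART attribute values (heuristic only)."""
    hints = []
    # Counters that describe trouble on the disk surface itself.
    media_errors = (
        attrs.get("Reallocated_Sector_Ct", 0)
        + attrs.get("Current_Pending_Sector", 0)
        + attrs.get("Offline_Uncorrectable", 0)
    )
    if media_errors > 0:
        hints.append("media: surface problems reported, suspect the disk itself")
    # CRC errors happen in transit between disk and controller.
    if attrs.get("UDMA_CRC_Error_Count", 0) > 0:
        hints.append("link: CRC errors on the interface, suspect cable, port or controller")
    if not hints:
        hints.append("clean: no counters raised in the attributes checked")
    return hints

# Values reported in this thread for the two dropped drives.
auto_dropped = triage_smart({"Reallocated_Sector_Ct": 0, "UDMA_CRC_Error_Count": 614})
hand_dropped = triage_smart({"Offline_Uncorrectable": 22})
print(auto_dropped)
print(hand_dropped)
```

By this rule, the drive with 614 CRC errors gets a "link" hint only, while the first drive with 22 offline uncorrectable sectors gets a "media" hint.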

Last edited by frontier204; April 5, 2016 at 07:26 PM. Reason: apparently I don't know what distro version I'm using ... not like it matters much with Debian