Author Topic: Synology NAS Western Digital RED Hard Drive FAIL  (Read 7741 times)

Offline Ed.Kloonk

  • Super Contributor
  • ***
  • Posts: 4000
  • Country: au
  • Cat video aficionado
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #50 on: June 08, 2021, 07:41:34 am »
None of the sound you recorded was platter contact, just the servo going nuts.

Yes, it seems to have recovered somewhat in terms of noise, but it's still way louder than a normal drive. The noise was horrendous though at the time. Maybe it was end stop slamming or something.

Orientation. Drive manufacturers swear blind it doesn't matter. And they might be right, when the drive is healthy.

iratus parum formica
 

Offline EEVblogTopic starter

  • Administrator
  • *****
  • Posts: 37740
  • Country: au
    • EEVblog
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #51 on: June 08, 2021, 08:01:16 am »
None of the sound you recorded was platter contact, just the servo going nuts.

Yes, it seems to have recovered somewhat in terms of noise, but it's still way louder than a normal drive. The noise was horrendous though at the time. Maybe it was end stop slamming or something.

Orientation. Drive manufacturers swear blind it doesn't matter. And they might be right, when the drive is healthy.

That thought did cross my mind.
 

Online Monkeh

  • Super Contributor
  • ***
  • Posts: 7992
  • Country: gb
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #52 on: June 08, 2021, 01:17:02 pm »
The little bit of slop on your driver there is to allow you to get it in and out of the screw. I assure you the ones used in the factory are of much higher quality.

None of the sound you recorded was platter contact, just the servo going nuts. I'm guessing head failure. If it had actually crashed the heads would have been torn to shreds. It might have jammed them up on the spacers, though.

I was wondering about this also. My limited experience of failure modes of WD Reds is watching the SMART info from a couple of RAIDs I have and yeah, just listening like Scotty of the Enterprise. When drives went from LBA to whatever voodoo they do now, I remember being told to watch out for drives writing just as a power outage occurs. Apparently, that is drive doom. The head touches the platter for a nanosecond. No obvious visual evidence. But the damage is done. And you can't even trust the drive reporting anymore since WD knowingly pissed on the WD Red brand with the whole SMR bullshit.

Makes no difference whether it's reading or writing, or doing nothing at all. If the power fails in operation the heads should park by themselves long before the platters slow enough to cause contact.

SMART data has never been trustworthy. All manufacturers have bugs or intentional omissions, and none publicly document the meaning of various values.
 

Offline madires

  • Super Contributor
  • ***
  • Posts: 7765
  • Country: de
  • A qualified hobbyist ;)
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #53 on: June 08, 2021, 06:27:20 pm »
IIRC, Google published statistics about HDD failures a few years ago and in about 50% of all cases SMART was able to predict that an HDD would fail.
 

Offline BradC

  • Super Contributor
  • ***
  • Posts: 2106
  • Country: au
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #54 on: June 09, 2021, 12:05:01 am »
IIRC, Google published statistics about HDD failures a few years ago and in about 50% of all cases SMART was able to predict that an HDD would fail.

True, but that wasn’t SMART saying “this drive is about to fail”, it was looking at the stats for an array of identical drives and going “this one is anomalous”. So the failure was predicted by generating statistics across a large population. 50% is still in “flip a coin” territory though. It can help with a proactive “should probably replace this one before it fails” routine, but that’s about it.

In my experience by the time SMART itself reports “failure is imminent” the drive is already in severe distress.
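
To illustrate the "this one is anomalous" idea, here is a minimal Python sketch that compares one SMART raw value across a set of identical drives and flags any drive sitting far from the group median. It is not the method used in the Google study. The function name `find_anomalous_drives`, the cut-off `k`, and the sample numbers are all made up for illustration.

```python
from statistics import median

def find_anomalous_drives(raw_values, k=5.0):
    """Return drive names whose value is far from the population median.

    raw_values: dict mapping a drive name to one SMART raw value.
    k: hypothetical cut-off, in multiples of the median absolute deviation.
    """
    values = list(raw_values.values())
    centre = median(values)
    # Median absolute deviation: a spread estimate that one outlier cannot skew.
    mad = median(abs(v - centre) for v in values)
    # If most drives report the same number the MAD is 0; fall back to a
    # spread of 1 unit (an arbitrary assumption) so the comparison still works.
    scale = mad if mad > 0 else 1.0
    return sorted(name for name, v in raw_values.items()
                  if abs(v - centre) > k * scale)

# Made-up raw values for one attribute across seven identical drives.
sample = {"sda": 30, "sdb": 31, "sdc": 29, "sdd": 30,
          "sde": 32, "sdf": 30, "sdg": 48}
print(find_anomalous_drives(sample))
```

A check like this only says a drive looks different from its siblings. Whether that difference predicts a failure is a separate question.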
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8275
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #55 on: June 09, 2021, 02:00:22 am »
The best predictor of failure is the number of reallocated sectors and "pending sectors". If they go above zero it's time to replace the drive.
 

Offline james_s

  • Super Contributor
  • ***
  • Posts: 21611
  • Country: us
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #56 on: June 09, 2021, 05:16:40 am »
SMART seems like a good idea, but in practice I don't think it has ever told me anything useful. The few drives I've had fail just completely failed with no warning.
 

Offline EEVblogTopic starter

  • Administrator
  • *****
  • Posts: 37740
  • Country: au
    • EEVblog
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #57 on: June 09, 2021, 09:52:29 am »
Both drives turned up today. Just started the sync process; it was at 1% after 10 minutes.
Using the normal non-priority resync mode. There is an option for faster, but no point stressing the drives I guess.
It uses very few resources on the NAS: 24% CPU and 13% RAM.
 

Offline rsjsouza

  • Super Contributor
  • ***
  • Posts: 5986
  • Country: us
  • Eternally curious
    • Vbe - vídeo blog eletrônico
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #58 on: June 09, 2021, 10:15:51 am »
SMART seems like a good idea, but in practice I don't think it has ever told me anything useful. The few drives I've had fail just completely failed with no warning.
Same here. I suspect that slow-developing issues might be properly flagged, but catastrophic failures are completely off game.
Vbe - vídeo blog eletrônico http://videos.vbeletronico.com

Oh, the "whys" of the datasheets... The information is there not to be an axiomatic truth, but instead each speck of data must be slowly inhaled while carefully performing a deep search inside oneself to find the true metaphysical sense...
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 6779
  • Country: pl
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #59 on: June 09, 2021, 02:50:20 pm »
The best predictor of failure is the number of reallocated sectors and "pending sectors". If they go above zero it's time to replace the drive.
That's not a predictor but an indicator of failure :P
And it may be one-off bit rot. I have a disk which still works after reporting one "pending sector" many years ago. Sector overwritten with new data, problem gone.
 

Online Monkeh

  • Super Contributor
  • ***
  • Posts: 7992
  • Country: gb
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #60 on: June 09, 2021, 02:55:51 pm »
A handful of failed and reallocated sectors is normal and harmless. The drives have spare sectors for a reason.

A rapidly growing count is another matter entirely.
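
The distinction between a few stable bad sectors and a rapidly growing count can be sketched in Python. This assumes `smartctl -A` style attribute rows, where the raw value is the tenth column, and compares two snapshots taken some time apart. The function names, the `growth_limit` of 10, and the sample rows are hypothetical, not vendor guidance.

```python
def parse_smart_attributes(text):
    """Parse `smartctl -A` style rows into {attribute_name: raw_value}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns;
        # the raw value is the tenth.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs

WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")

def assess_sectors(previous, current, growth_limit=10):
    """Classify a drive from two attribute snapshots taken some time apart.

    growth_limit is an arbitrary, hypothetical threshold, not a vendor figure.
    """
    growth = sum(current.get(n, 0) - previous.get(n, 0) for n in WATCHED)
    total = sum(current.get(n, 0) for n in WATCHED)
    if growth > growth_limit:
        return "replace"   # the count is climbing quickly
    if total > 0:
        return "watch"     # a few bad sectors, but stable
    return "ok"

# Made-up snapshots for illustration only.
last_week = parse_smart_attributes("""
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       3
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
""")
today = parse_smart_attributes("""
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       41
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
""")
print(assess_sectors(last_week, last_week))
print(assess_sectors(last_week, today))
```

The point of taking two snapshots is that the trend matters more than the absolute number.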
 

Offline EEVblogTopic starter

  • Administrator
  • *****
  • Posts: 37740
  • Country: au
    • EEVblog
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #61 on: June 10, 2021, 02:14:58 am »
Rumors of the death of all the drives in my NAS during resync were greatly exaggerated.
NAS back up and running with a replacement CMR Red Plus drive.

BTW, max drive temp during re-sync was only 2°C above normal idle temp.
And the EFAX SMR drive is only at 15000 hours, so I might just leave it in place.
 

Offline BradC

  • Super Contributor
  • ***
  • Posts: 2106
  • Country: au
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #62 on: June 10, 2021, 03:01:23 am »
And the EFAX SMR drive is only at 15000 hours, so I might just leave it in place.

As already stated, the EFAX drives suck at large random write loads. Rebuilding another drive is a large sequential read load, and they are fine for that. Due to the increase in platter density, they may even be faster than a CMR drive. It's when they need to write randomly they have to "cache" the writes, then when they get some time they re-write the affected zone from the write start point until the end. A bit like an SSD needs to do a read/erase/write on an entire block to change a single "sector" (which is what clever wear leveling works around). SMR needs to re-write everything in the "zone" (think flash erase block size) after the desired write. If the drive knows it's a complete sequential write then it can start at the start and power on through. If it's not sure though it's going to try and write to the "cache" first and then do the proper re-organisation in its idle time.

The issue with SMR drives is in re-build due to the sustained write load. If the NAS vendor has worked with the drive manufacturer to ensure they write in a manner the drive can cope with then it's all ok, provided you aren't doing huge writes to the array while it's rebuilding. They are still considerably slower than CMR drives when you fill up their "cache". Just like TLC/QLC drives are when you fill up the SLC "cache".
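
The zone rewrite behaviour described above can be put into a toy Python model. It is a rough sketch, not how any real drive firmware works: it assumes that updating one track forces a rewrite of every track from there to the end of the zone, and it ignores the "cache" region entirely. The function names and the zone size of 8 tracks are made up to keep the numbers small.

```python
def tracks_rewritten(track, tracks_per_zone):
    """Physical track writes needed to update one track in a shingled zone.

    Toy model: changing track `track` forces a rewrite of every track from
    there to the end of the zone, because later tracks overlap earlier ones.
    """
    if not 0 <= track < tracks_per_zone:
        raise ValueError("track is outside the zone")
    return tracks_per_zone - track

def write_amplification(updated_tracks, tracks_per_zone):
    """Average physical track writes per logical track update."""
    physical = sum(tracks_rewritten(t, tracks_per_zone) for t in updated_tracks)
    return physical / len(updated_tracks)

TRACKS_PER_ZONE = 8  # made-up zone size, chosen only to keep the numbers small

# Random in-place updates: each one drags the rest of the zone along.
print(write_amplification([0, 3, 7], TRACKS_PER_ZONE))

# A known full sequential write starts at track 0 and streams through once,
# so every track is written exactly one time.
print(tracks_rewritten(0, TRACKS_PER_ZONE) / TRACKS_PER_ZONE)
```

In this model a write near the start of a zone is the expensive case, while a full sequential write costs no more than it would on a CMR drive.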

 

Offline EEVblogTopic starter

  • Administrator
  • *****
  • Posts: 37740
  • Country: au
    • EEVblog
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #63 on: June 10, 2021, 03:10:06 am »
And the EFAX SMR drive is only at 15000 hours, so I might just leave it in place.

As already stated, the EFAX drives suck at large random write loads. Rebuilding another drive is a large sequential read load, and they are fine for that. Due to the increase in platter density, they may even be faster than a CMR drive. It's when they need to write randomly they have to "cache" the writes, then when they get some time they re-write the affected zone from the write start point until the end. A bit like an SSD needs to do a read/erase/write on an entire block to change a single "sector" (which is what clever wear leveling works around). SMR needs to re-write everything in the "zone" (think flash erase block size) after the desired write. If the drive knows it's a complete sequential write then it can start at the start and power on through. If it's not sure though it's going to try and write to the "cache" first and then do the proper re-organisation in its idle time.

The issue with SMR drives is in re-build due to the sustained write load. If the NAS vendor has worked with the drive manufacturer to ensure they write in a manner the drive can cope with then it's all ok, provided you aren't doing huge writes to the array while it's rebuilding. They are still considerably slower than CMR drives when you fill up their "cache". Just like TLC/QLC drives are when you fill up the SLC "cache".

Sure, but why replace it after 15,000 hours if all the SMART tests are fine, zero bad sectors, and I have zero performance issues with it?
If anything, based on the fact that one of the 3 x 26,000 hour CMR drives failed, I should be replacing those instead of the SMR one.
 

Offline BradC

  • Super Contributor
  • ***
  • Posts: 2106
  • Country: au
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #64 on: June 10, 2021, 04:24:00 am »
Sure, but why replace it after 15,000 hours if all the SMART tests are fine, zero bad sectors, and I have zero performance issues with it?
If anything, based on the fact that one of the 3 x 26,000 hour CMR drives failed, I should be replacing those instead of the SMR one.

I wasn't implying you replace it. I was just giving a bit of background on the difference between the two. Far out, it's barely run in. I wouldn't replace the CMR drives either. I have a small array of those 6TB spinners with 36k hours on them. Like most drives, keep them spinning with heads loaded and they'll keep going.

  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36033
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36023
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36023
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36023
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36023
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36031
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36023
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36022

I have another "slightly older" set of WD 2TB drives supplemented with some "scraps" I had lying around:
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       21560
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59698
  9 Power_On_Hours          0x0032   028   028   000    Old_age   Always       -       52862
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59698
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59701
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59729
  9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       44217
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59698
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59697

As long as they keep going, just use them until they die. I'd avoid buying any new SMR drives though.
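
For anyone wanting to turn Power_On_Hours rows like the ones above into years of spinning, here is a small Python sketch. It assumes the raw value is the last number on each attribute 9 row, as in the output pasted above. The names `ROW` and `power_on_years` are made up, and the two sample rows are copied from the lists above.

```python
import re

HOURS_PER_YEAR = 24 * 365.25

# Matches attribute 9 rows as printed by `smartctl -A`; the raw value is the
# last number on the line.
ROW = re.compile(r"^\s*9\s+Power_On_Hours\b.*?(\d+)\s*$", re.MULTILINE)

def power_on_years(text):
    """Return the power-on time of each matching row in years, to one decimal."""
    return [round(int(hours) / HOURS_PER_YEAR, 1) for hours in ROW.findall(text)]

rows = """
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36033
  9 Power_On_Hours          0x0032   019   019   000    Old_age   Always       -       59729
"""
print(power_on_years(rows))  # [4.1, 6.8]
```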
 

Offline EEVblogTopic starter

  • Administrator
  • *****
  • Posts: 37740
  • Country: au
    • EEVblog
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #65 on: June 10, 2021, 01:00:52 pm »
As long as they keep going, just use them until they die. I'd avoid buying any new SMR drives though.

That's the plan. Never knew an SMR went into there to begin with; now that I know, I wouldn't buy another one.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14481
  • Country: fr
Re: Synology NAS Western Digital RED Hard Drive FAIL
« Reply #66 on: June 10, 2021, 05:20:27 pm »
SMART seems like a good idea, but in practice I don't think it has ever told me anything useful. The few drives I've had fail just completely failed with no warning.

I guess the answer is: it depends.
The only failures I've had in about 20 years were those WD Green drives, and in this case, the SMART data showed a definite problem. The number of errors kept growing, and they began to be pretty slow, but they were still readable, so swapping them for new drives in my RAID setting was no problem, without having to use a backup.
The other HDD failure I had was over 20 years ago with a Seagate drive, and at the time, AFAIR, it didn't implement SMART.

 

