For spinny disks, I replace drives whenever there are any reallocated sectors (RAW reallocated sector count is nonzero). This includes brand-new drives. (And I don't trust Seagate drives, even as paperweights.)
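A rough sketch of how that rule could be automated, assuming the text comes from `smartctl -A` on an ATA drive and uses its usual ten-column attribute table (ID, name, flag, value, worst, thresh, type, updated, when-failed, raw value). The function names are made up for illustration.

```python
from typing import Optional

REALLOCATED_SECTOR_ID = "5"  # SMART attribute 5 is Reallocated_Sector_Ct


def raw_reallocated_count(smartctl_output: str) -> Optional[int]:
    """Return the RAW value of SMART attribute 5, or None if the row is absent.

    Assumes the ten-column table layout; the raw value is the tenth column.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == REALLOCATED_SECTOR_ID:
            return int(fields[9])
    return None


def should_replace(smartctl_output: str) -> bool:
    """Apply the 'any reallocated sector means replace' rule from the post.

    If the attribute is missing we return False, since there is nothing to judge.
    """
    count = raw_reallocated_count(smartctl_output)
    return count is not None and count > 0
```

Feed it the captured text of `smartctl -A /dev/sdX`. Drives behind some USB bridges or RAID controllers may not report this table at all, in which case the sketch simply returns `None`.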
...
In about half the cases you see reallocated sectors increasing a few hours (of use) before the drive dies completely, but then again, most drives do perfectly fine with a small number (say, up to a dozen) of reallocated sectors – and many have a few reallocated sectors straight from the factory.
My experience is that until the wear-out part of their operating life, about half of the drives degrade quickly after the first couple of new bad sectors. The others just continue to operate like normal.
Which means that no RAID or disk monitoring is as good as backups and keeping your important data on multiple physically separate media.
I currently have a 5 x 2TB RAID6 on my test system, made up of all of my spare 2TB drives. One of them, a Samsung HD204UI, keeps returning bad sectors without marking any of them bad, and the RAID just handles it, which is a nice confidence booster that the RAID controller is working properly.
I recently moved my "portable" data to an external SSD (1) and I maintain an offline backup on a duplicate SSD. Periodically, every couple of days or so, I hash the entire contents, update the offline backup, and then check the hashes on the offline backup. Generating and checking the hashes also reads the entire contents of each SSD, giving them the maximum opportunity to scrub a bad sector. With USB 3.0, the external 250GB SSDs are fast enough to use directly, and the backup procedure only takes a few minutes.
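The hash, update, check routine could look something like the following. This is only a sketch, not the poster's actual script: SHA-256, the in-memory manifest, and the function names are all assumptions.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """SHA-256 of one file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its hash; reads every file."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing under root or hash differently."""
    root = Path(root)
    mismatched = []
    for rel, expected in manifest.items():
        candidate = root / rel
        if not candidate.is_file() or hash_file(candidate) != expected:
            mismatched.append(rel)
    return mismatched
```

Usage would be `manifest = build_manifest(primary)`, then copy the changes to the backup, then `verify(backup, manifest)` and expect an empty list. One caveat: right after a copy, the operating system may serve the verification reads from its cache rather than from the backup media, so checking after reconnecting the backup drive is the more convincing test.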
Do not all manufacturers have automatic relocation and isolation of defective sectors?
Yes, they do.
How it works in real life, is that when the drive detects that it failed to correctly write a sector,
Nope. Drives do not detect that. Writing is blind; the read heads are off. Verification would require waiting an entire revolution to read the sector back and compare. The drive does not do this; it is up to the operating system.
That is right; the drive does not know the sector is bad until it is read later, at which point it marks it as "pending reallocation". The reallocation then happens when the sector is next written, which indicates to the drive that the old data is no longer needed. Until then, it will make a best effort to recover the data every time it is read.
If you are running RAID, then when a bad sector is detected, the RAID should regenerate the data and write it back to the drive to force reallocation. RAID controllers usually support periodic scrubbing where all data is read to look for bad sectors so they can be scrubbed.
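To make the sequence concrete, here is a toy model of it: a sector goes bad silently, is marked pending on a failed read, and is remapped when it is rewritten, with a scrub pass supplying the rewrite. Everything here is a simplification for illustration. `ToyDrive` and `scrub` are made-up names, the redundancy is single XOR parity rather than RAID6, and real firmware and controllers are far more involved.

```python
class ToyDrive:
    """Toy drive: one integer per sector, plus pending/reallocated bookkeeping."""

    def __init__(self, sectors, spares=4):
        self.data = [0] * sectors
        self.unreadable = set()  # sectors whose media has failed (drive unaware)
        self.pending = set()     # sectors the drive has marked "pending reallocation"
        self.reallocated = 0
        self.spares = spares

    def read(self, lba):
        if lba in self.unreadable:
            self.pending.add(lba)  # the fault is only discovered on read
            raise IOError(f"unreadable sector {lba}")
        return self.data[lba]

    def write(self, lba, value):
        if lba in self.pending:
            # A write means the old contents are no longer needed: remap to a spare.
            if self.spares == 0:
                raise IOError("no spare sectors left")
            self.spares -= 1
            self.reallocated += 1
            self.pending.discard(lba)
            self.unreadable.discard(lba)
        self.data[lba] = value


def scrub(drives):
    """Read every stripe; rebuild a single failed member by XOR and write it back."""
    repaired = []
    for lba in range(len(drives[0].data)):
        values, failed = [], []
        for index, drive in enumerate(drives):
            try:
                values.append(drive.read(lba))
            except IOError:
                failed.append(index)
        if len(failed) == 1:
            rebuilt = 0
            for value in values:
                rebuilt ^= value  # XOR of the survivors equals the missing member
            drives[failed[0]].write(lba, rebuilt)  # the write forces reallocation
            repaired.append((failed[0], lba))
    return repaired
```

Note that in this model a write to a bad sector that has never been read succeeds without complaint, which matches the "writing is blind" point above.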
What is the truth? Do HDDs have an automatic function in the firmware to relocate and isolate bad sectors (bad blocks), or not? Only server HDDs? And what about HDDs from the year 2000?
As far as I know, every hard drive with an integrated controller supports automatic reallocation as described above.
Actually, every hard drive made since we moved from plain old IDE to ATA employs ECC (and it was common even before then). No exceptions. All data on a hard drive is ECC protected, and the drive will know if there's an error (the idea behind "bit rot"). And unless the bit error happens to be in a defective sector *and* the drive has exhausted its spares, the drive will correct the error the next time the sector is read.
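For anyone wondering how a drive can "know" there is an error and fix it, here is the idea in miniature using a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. This is purely illustrative: real drives use much stronger codes over whole sectors, and nothing below reflects any particular drive's firmware.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, error_position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the flipped position
    if position:
        c[position - 1] ^= 1         # flip it back
    return [c[2], c[4], c[5], c[6]], position
```

Flipping any one bit of `encode([1, 0, 1, 1])` and passing the result to `decode` gives back `[1, 0, 1, 1]` along with the position of the flipped bit. Two flipped bits would be miscorrected by this small code, which is one reason real drives need something stronger.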
That would be scrub-on-read versus scrub-on-write, which we have been discussing. I personally have never seen a scrub-on-read on a hard drive, but good SSDs are supposed to do this. (2) On a hard drive, scrub-on-read presents practical difficulties because typically no verification pass is done.
(1) An external SSD in an enclosure is my replacement for USB Flash sticks which I consider to be too unreliable. Maybe somebody makes a reliable USB Flash stick, but I never found it.
(2) SSDs which perform idle time scrubbing *must* also support power loss protection because program and erase cycles can occur at any time, which implies that SSDs which do not support power loss protection do not perform idle time scrubbing.