NOTE Everything here assumes that we are talking about spinning rust, not SSDs! While most of this post is generally valid for SSDs too, some of the information may need adjustment.
But this time a thought hit me: with all that intelligence in the drive firmware, it re-allocates bad sectors all by itself... Doesn't that make a long format and surface scan a waste of time?
This is a very good thought. Indeed yes: nowadays the firmware reallocates damaged sectors all by itself. If a software scan ever detects bad sectors, it means the drive is so badly damaged that the firmware has run out of sectors to reassign (each drive normally keeps a pool of sectors reserved for this purpose, not accessible from the computer’s side).
You can see the information about reallocated sectors in your disk’s SMART data. This is one of the very few SMART values that is typically meaningful in its raw form: across every vendor I’ve heard of, it indicates the actual number of reallocated sectors. Keep an eye on that value, because if it grows above 0 (or the normalized value falls), it is a clear sign that the drive is becoming unreliable and you should start looking for a replacement.
Under Linux you can check SMART data using smartctl from smartmontools (available as a package in many distros):
$ sudo smartctl -a /dev/disk/by-id/ata-WDC_WD1002FAEX-00Y9A0_WD-WCAW34208404
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.3.11-arch1-1] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
[… snip …]
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 5
3 Spin_Up_Time 0x0027 231 169 021 Pre-fail Always - 1425
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 306
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 015 015 000 Old_age Always - 62420
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 303
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 231
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 76
194 Temperature_Celsius 0x0022 101 095 000 Old_age Always - 46
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
[… snip …]
Writing and then reading the whole drive, as suggested by earlier posters, will invoke reallocation. But this is a lengthy process, not really suitable for everyday use by most people. A 4 TB drive at 200 MB/s takes about 5.5 hours just to write (only to write!), and 200 MB/s is at the upper end of what you can expect from a 7200 RPM drive; for most drives, even smaller ones, prepare to spend a day per drive. And the only information you gain from it is “this drive is unsuitable for further use”, if the reallocated sector count grows above 0 during the operation. If it doesn’t, you learn nothing at all about the drive’s future reliability.
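If you nevertheless want to run such a pass, the classic Linux tool for it is badblocks from e2fsprogs. A minimal sketch, assuming the drive is /dev/sdX and holds no data you want to keep (the -w test overwrites everything):

$ sudo smartctl -A /dev/sdX | grep -i reallocated   # note the raw counts before the test
$ sudo badblocks -b 4096 -wsv /dev/sdX              # destructive write-then-read pass (-w runs four pattern passes by default)
$ sudo smartctl -A /dev/sdX | grep -i reallocated   # any growth afterwards means: replace the drive

The -b 4096 matches the physical sector size of modern drives and avoids block-count limits on large disks; -s and -v just make badblocks report its progress.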
Someone has suggested a full self-test, and that is also a good option, but it is of course equally time-consuming and gives the same type of information. The advantage is that there is a non-zero chance the vendor has added some extras to the test: checks you can’t perform yourself from software, or that require insider knowledge. While you can run these tests from CLI tools (like smartctl under Linux), interactive tools may be a better choice here, because they show the test progress automatically and also handle cancelling etc. by themselves.
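If you do go with smartctl anyway, the whole interaction is only a few commands; a sketch, assuming the drive is /dev/sdX (the test runs inside the drive itself, so the machine stays usable meanwhile):

$ sudo smartctl -t long /dev/sdX      # start the extended self-test; an estimated duration is printed
$ sudo smartctl -c /dev/sdX           # “Self-test execution status” shows the remaining percentage
$ sudo smartctl -l selftest /dev/sdX  # the verdict lands in the self-test log when it finishes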
Unless you are really inclined to spend your time testing your drives for backup purposes, I would suggest performing only the short SMART test to determine whether the drive’s circuitry, mechanics and calibration are still OK (a sketch follows below). You will never have a 100% certain backup, so just “stop worrying and learn to love the bomb”. It is rarely the case that data of a size requiring an HDD is worth that much protection. There are exceptions, like data the law obliges you to keep, or data whose loss would cause significant financial or reputational damage to a company. But most likely it is things like movies downloaded from the internet or some other hoarding case, or photos you will probably never look at again. The time needed to perform the tests is much better spent with friends or family.
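The short test takes only a couple of minutes; again assuming /dev/sdX:

$ sudo smartctl -t short /dev/sdX     # electrical/mechanical checks plus a small read pass
$ sudo smartctl -l selftest /dev/sdX  # read the result afterwards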
If you do want to spend your time on that, I would rather suggest computing checksums of the files on the original medium and then, after copying them, verifying those checksums on the backup medium, because that actually gives you some real information.
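A sketch of how that can look with standard tools; /data and /mnt/backup are just example paths:

$ cd /data && find . -type f -print0 | xargs -0 sha256sum > ~/data.sha256   # checksum the originals
$ cd /mnt/backup && sha256sum -c --quiet ~/data.sha256                      # verify the copies, print only failures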
The truly important data, e.g. private keys or a password manager’s database, is no more than tens of megabytes. You can just buy a bunch of cheap USB sticks and copy it there regularly, preferably spreading the media physically across multiple places, in case of some disaster or a theft. Also develop the habit of checking your backups for integrity, and have a fuckup policy in place. That is much more important than just blindly copying data.
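For the USB-stick case the integrity check can be as simple as the following (paths purely illustrative; the remount forces a real read from the stick instead of the page cache):

$ cp ~/keys/passwords.kdbx /mnt/usb/
$ sync && sudo umount /mnt/usb && sudo mount /dev/sdY1 /mnt/usb   # /dev/sdY1 being the stick’s partition
$ cmp ~/keys/passwords.kdbx /mnt/usb/passwords.kdbx && echo "backup OK"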