Author Topic: HDDs and badblocks corrupt downloaded files?  (Read 18064 times)


Offline BradC

  • Super Contributor
  • ***
  • Posts: 2109
  • Country: au
Re: HDDs and badblocks corrupt downloaded files?
« Reply #100 on: October 08, 2020, 12:11:32 pm »
I remember using Spinrite in the old days (decades ago). Apparently it is still for sale, and there is an interesting (at least to me) roughly 12-minute video from 2012 about how it works. The video explains 'bit rot', and how modern drives, with their ever smaller data bits (i.e. tiny physical dimensions as stored on the platters, a consequence of today's massive capacities), are susceptible to errors because of those tiny dimensions.
It also explains how Spinrite uses different strategies to attempt to read data accurately, even when the ECC has gone too bad to recover the data via normal reads.

Don't bother reading or watching any of the Spinrite propaganda. It does precisely one thing. It reads each sector until it finds one it can't read, then it engages "dynastat". This performs repeated read-long commands on the sector and builds a statistic about what it thinks each bit should be. So for example, 4096 bits in a "sector", and it reads each sector 100 times. Any bit that was a 1 for more than 50 reads is flagged as a 1 and so on. It then writes the sector back to the disk. If at any time it gets a good read, it just writes that straight back.
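Roughly, the voting step works like this (a minimal Python sketch of the idea only; raw_read_long() is a hypothetical stand-in for an ECC-bypassing READ LONG of one sector, not GRC's actual code):

def recover_sector(raw_read_long, bits_per_sector=4096, attempts=100):
    # Tally how often each bit position read back as '1' across many raw reads.
    ones = [0] * bits_per_sector
    for _ in range(attempts):
        for i, bit in enumerate(raw_read_long()):   # one ECC-less read of the sector
            ones[i] += bit
    # Majority vote: a bit seen as '1' in more than half the reads is taken as '1'.
    return [1 if count > attempts // 2 else 0 for count in ones]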

Now, it does a credible job at that, but it really is a "one trick pony" and only "recovers" data in-place.

I haven't used it for a few years but it was still working on 4TB drives and they responded to read-long.

The rest of the purported features, such as performing periodic maintenance to avoid bit-rot, border on homeopathy, but he still sells enough copies to keep the lights on, and he certainly has a pack of religious followers.

I keep it as a tool in the box, but it gets used a lot less than it did in the days of ST-506 drives and floppies.
 
The following users thanked this post: tooki, MK14

Offline MK14

  • Super Contributor
  • ***
  • Posts: 4853
  • Country: gb
Re: HDDs and badblocks corrupt downloaded files?
« Reply #101 on: October 08, 2020, 01:08:32 pm »
Don't bother reading or watching any of the Spinrite propaganda. It does precisely one thing. It reads each sector until it finds one it can't read, then it engages "dynastat". This performs repeated read-long commands on the sector and builds a statistic about what it thinks each bit should be. So for example, 4096 bits in a "sector", and it reads each sector 100 times. Any bit that was a 1 for more than 50 reads is flagged as a 1 and so on. It then writes the sector back to the disk. If at any time it gets a good read, it just writes that straight back.

Now, it does a credible job at that, but it really is a "one trick pony" and only "recovers" data in-place.

I haven't used it for a few years but it was still working on 4TB drives and they responded to read-long.

The rest of the purported features, such as performing periodic maintenance to avoid bit-rot, border on homeopathy, but he still sells enough copies to keep the lights on, and he certainly has a pack of religious followers.

I keep it as a tool in the box, but it gets used a lot less than it did in the days of ST-506 drives and floppies.

Thanks, that makes sense.
Spinrite has been available for a very long time, so I'm not surprised by your comments. Hard disks themselves are gradually becoming a little more obsolete as time goes on.
I think once SSDs commonly, cheaply and reliably exceed even the biggest hard disk drives in capacity (per $/£), HDDs will disappear even sooner.

Two big things concern me about the increasingly popular SSDs. One is their limited write endurance, which potentially spoils their use in some applications.
The other is the (reportedly/specified) relatively short data retention if the drive is left unpowered (and hence unrefreshed) for long periods of time.
E.g. some reports suggest the retention can be as short as 1 year on some modern SSDs.

Unlike other parameters of an SSD (e.g. capacity, access speed and total data bandwidth), data retention is difficult for users to test themselves.

Some of the later HDDs also seem to have new technologies (which I haven't been keeping up with) that potentially limit write endurance and/or data lifetime, as a consequence of the techniques that give modern HDDs their huge capacities.

In general, user reviews/reports seem to be rather negative about these new types of HDD, and annoyed/angry at how the hard drive manufacturers have rather silently introduced these technologies into their drive ranges, without making it at all clear to end users exactly what type of drive they are really getting.
E.g. stating that a given hard drive is meant for data archiving ONLY, or ONLY for a limited amount of data written per year.

I.e. it makes it more of a nightmare to choose a decent/reliable massive-capacity HDD for backup or other important uses.

E.g. I somewhat recently bought some huge (to me) capacity external HDD units, WD if I remember correctly. But apparently/reportedly you have to physically open them up (which I'm reluctant to do) in order to find out whether you have got the 'proper' HDD type or one of these newer, less well respected types.
From memory, WD Reds = good, WD Whites = not so good.

You used to be able to simply splash out and buy quality Hitachi drives, but these days the separate HDD manufacturers have (through takeovers and mergers) consolidated into only a handful of very big ones.
« Last Edit: October 08, 2020, 01:12:44 pm by MK14 »
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3664
  • Country: us
Re: HDDs and badblocks corrupt downloaded files?
« Reply #102 on: October 08, 2020, 04:25:45 pm »
every read of a flash cell reduces the amount of charge that is stored in that cell

Is that actually true? The structure of a bit in flash (or EPROM) is a floating gate insulated from the underlying transistor. Because it's floating, charges cannot leak away except by quantum tunneling. I thought it was pretty well established that the speed of tunneling was independent of read activity or indeed whether there is even power applied.
 

Offline Halcyon

  • Global Moderator
  • *****
  • Posts: 5870
  • Country: au
Re: HDDs and badblocks corrupt downloaded files?
« Reply #103 on: October 09, 2020, 06:14:56 am »
I remember using Spinrite in the old days (decades ago). Apparently it is still for sale, and there is an interesting (at least to me) roughly 12-minute video from 2012 about how it works. The video explains 'bit rot', and how modern drives, with their ever smaller data bits (i.e. tiny physical dimensions as stored on the platters, a consequence of today's massive capacities), are susceptible to errors because of those tiny dimensions.
It also explains how Spinrite uses different strategies to attempt to read data accurately, even when the ECC has gone too bad to recover the data via normal reads.

Don't bother reading or watching any of the Spinrite propaganda. It does precisely one thing. It reads each sector until it finds one it can't read, then it engages "dynastat". This performs repeated read-long commands on the sector and builds a statistic about what it thinks each bit should be. So for example, 4096 bits in a "sector", and it reads each sector 100 times. Any bit that was a 1 for more than 50 reads is flagged as a 1 and so on. It then writes the sector back to the disk. If at any time it gets a good read, it just writes that straight back.

Now, it does a credible job at that, but it really is a "one trick pony" and only "recovers" data in-place.

I haven't used it for a few years but it was still working on 4TB drives and they responded to read-long.

The rest of the purported features, such as performing periodic maintenance to avoid bit-rot, border on homeopathy, but he still sells enough copies to keep the lights on, and he certainly has a pack of religious followers.

I keep it as a tool in the box, but it gets used a lot less than it did in the days of ST-506 drives and floppies.

Ahhh Spinrite! Thanks for reminding me. I think they put more effort into the ASCII animations and various technical-looking screens than actually doing anything meaningful.

 
The following users thanked this post: MK14

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: HDDs and badblocks corrupt downloaded files?
« Reply #104 on: October 09, 2020, 08:37:13 am »
every read of a flash cell reduces the amount of charge that is stored in that cell

Is that actually true? The structure of a bit in flash (or EPROM) is a floating gate insulated from the underlying transistor. Because it's floating, charges cannot leak away except by quantum tunneling. I thought it was pretty well established that the speed of tunneling was independent of read activity or indeed whether there is even power applied.

It's been a while since I read about this (and I can't find the paper that discussed it right now), but from what I remember the issue was that during the read process of a NAND cell (where the flash controller applies a range of voltages to detect the cell's threshold voltage), charges trapped in the oxide break out, thereby reducing the cell's threshold voltage. It has no practical impact on older SSD types using SLC or (two-level) MLC flash, but it's an issue with modern small 3+ level NAND cells (which, considering that a modern TLC cell stores fewer than 100 electrons, isn't surprising).

On top of that, with NAND flash there's also an effect called "Read Disturbance", where a read operation on one cell shifts the threshold voltages of neighboring cells ("victim cells") in the same block so that they can end up in a different logical state than the one that was written.

Reading doesn't wear out NAND flash, but it is far from being impact-free.
 
The following users thanked this post: helius, MK14

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: HDDs and badblocks corrupt downloaded files?
« Reply #105 on: October 09, 2020, 09:05:46 am »
So in short, you put a knowingly defective drive in a RAID array, and when the rebuild process falls over the drive's dead sectors you fix it by manually relocating corrupt host data to another sector so the rebuild process can be fooled into completing? Seriously???  |O

No, I replaced a good drive in the RAID array with a different good drive and bad sectors on one of the remaining drives stopped the rebuild.

I see. Not that this is much better ;)

Quote
Had I scrubbed the RAID array before doing the drive swap, then there would have been no problem.

The bad drive would still have been bad and should have been discarded.

Quote
Quote
The fact that the rebuild failed on the drive should have already been a warning that the best place for this drive would be the electronics recycling dumpster (or the shredder if the data was sensitive).

The drive was good; see above.  The only failure was your reading comprehension.

The drive you may have put in originally was good (that's the bit I got wrong), but the existing drive which exhibited bad sectors certainly wasn't.

Again, the rebuild failure should have been a warning sign that this drive is defective and should have been discarded.

As I said, for hobbyist use there aren't any hard rules and anything goes, but if this was for data that really mattered then keeping a known-defective drive in the array would be grossly negligent.
 

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: HDDs and badblocks corrupt downloaded files?
« Reply #106 on: October 09, 2020, 10:19:42 am »
Nope. What you wrote would be (mostly) correct if we were talking about MFM/RLL/ESDI or early IDE (or SCSI-1 drives) from some 30 years ago. But we're not.

Modern (i.e. made in 2000 or later) drives don't expose their physical layout to the host. For backward-compatibility reasons they report an artificial CHS geometry which has nothing to do with the actual physical layout (so that antique systems still using CHS for addressing can boot from these drives). Even the sector size is often fake, as most modern hard drives use 4k sectors internally while reporting 512-byte sectors on the interface.

But for the most part the CHS layout isn't even used. LBA (Logical Block Addressing) was a thing even before the year 2000 (it was first used with SCSI drives long before then), and it has been the standard way of addressing disks for many, many years. With LBA, the host only sees a device which has a certain number of blocks; there's no CHS involved. LBA has been supported at least since Windows 98 and NT 4.0, and became the standard with Windows 2000. And while LBA was an extension for IDE/ATA, it is the defined addressing standard for SATA, SAS and NVMe storage.
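For reference, translating a (reported, fake) CHS triple into an LBA is just arithmetic; a quick Python illustration of the classic formula:

def chs_to_lba(c, h, s, heads_per_cylinder, sectors_per_track):
    # Classic CHS-to-LBA translation; sectors are numbered from 1, hence "s - 1".
    # The geometry values are whatever the drive *reports*, which on a modern
    # drive has nothing to do with the real physical layout.
    return (c * heads_per_cylinder + h) * sectors_per_track + (s - 1)

# Last block of the classic 1024/255/63 reported geometry (the old ~8 GB limit):
print(chs_to_lba(1023, 254, 63, 255, 63))   # 16450559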

Your RAID controller, whatever type/model that is, would have to be really old not to use LBA (and if the controller is that old then I guess the disks are, too). And even then it would only see the fake geometry reported by the drive, not the real one.

The only person here bringing up pre-IDE drives is you.

Yes, because that's the *only* generation for which what you state could even be relevant.

Your general understanding of this topic also appears to be rooted around that timeframe.

Quote
What part of "scrub-on-write" do you not understand?

What part of "it doesn't work the way you think you do" do you not understand?

Quote
Quote
If a modern RAID controller encounters a bad block (unrecoverable error), it will try to reconstruct the data from the redundancy disks and then may attempt to re-write the block to the affected disk. If this disk has sufficient spare sectors, it may revert the write to a spare block, after which the block will be fine and the integrity of the data is restored. If that was a one-off, the drive may well be fine for years. However, if that happens more often (as it's the case on a dying drive), the disk will eventually run out of spare sectors after which the attempt by the RAID controller to re-write the block will fail, and then the disk is failed and the array goes into contingency mode.

So I am wrong, but you agree while contradicting what you said earlier?

Nope. You just don't seem to understand that the checks and activities a RAID controller performs are at a different level from what the internal defect management of a hard drive does.

Handling of media defects is up to the drive's internal defect management, which (again) is *transparent* to the host. When it's not (i.e. the drive encounters an unrecoverable error), that means the drive is telling you that it is on its way out. When that happens on a disk that is part of a RAID array in a redundant configuration (i.e. anything except RAID0 or JBOD), the controller, in an attempt to maintain the integrity of the array, will reconstruct the data from its redundancy elements and send the block to the drive for writing (where the drive's defect management will write it to a spare physical sector, if there are any left). If that fails (because there are no spare sectors left) then the drive is flagged for replacement.

Now what you seem to believe is that the RAID controller's handling of defective sectors is synonymous with the hard drive's internal defect management. But that is not the case. The RAID controller has zero control over how the drive handles defects; all it can do is ask the drive to write a certain block, and whatever the RAID controller does, it does only to maintain the integrity of the RAID array (not to handle drive defects, which, again, is what the drive itself is responsible for).
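To illustrate the distinction: for a RAID-5 stripe the controller's "reconstruct from redundancy elements" step is just an XOR over the surviving members, entirely independent of whatever the drive then does internally with the rewritten block. A minimal Python sketch (not any particular controller's firmware):

def reconstruct_missing_block(surviving_blocks):
    # In RAID-5, the data blocks and the parity block of a stripe XOR to zero,
    # so a single unreadable block equals the XOR of all the remaining blocks.
    missing = bytearray(len(surviving_blocks[0]))
    for block in surviving_blocks:
        for i, byte in enumerate(block):
            missing[i] ^= byte
    return bytes(missing)

# The controller writes the result back to the flagged drive; whether that lands
# on a spare physical sector is the drive's own (transparent) defect management.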

Quote
Quote
Quote
It would be dumb to discard an entire drive because of a single bad read when the data can be recovered and written back to force the drive to reallocation that sector; of course firmware based RAID controllers often are this dumb.

In a professional environment, if a drive shows bad sectors at the interface it's scrapped, period. Any decent RAID controller will immediately flag a drive as soon as unrecoverable errors start to appear. Because for a modern drive unrecoverable errors are usually a clear sign that the drive is defective, and the only thing that would be stupid is to not scrap the drive and risk the integrity of the host data. It's as simple as that. Because at the end of the day the host data (and the hourly rate for the admin who has to deal with it and the potential fall-out should the drive remain in service) is worth a lot more than that stupid hard drive.

Now I accept that for hobbyist use that may well be different, and if you can't afford to replace a drive it's certainly tempting to work around the problem of a defective sector. But that is only a viable option if your data (and your time) isn't worth much (because if it was, you'd not try to cheap-skate on backups). And it doesn't change the fact that the drive is telling you that you can no longer rely on it.

That is a policy decision and not inherent to RAID or sector reallocation.  Why even bring it up?

Because it's *not* a "policy decision", these are established procedures based on how storage systems actually work.

A drive that shows defects on the interface is no longer reliable and should be discarded. It's not rocket science to understand why this is established practice.

Quote
Quote
Quote
The internal defect management could operate on a correctable error, but I have never seen happen.

Well, yes, that's because it's supposed to be *transparent* to the host. Which is the whole point of a "defect-free interface".

It cannot be transparent because the visible reallocated sector count would increase.

"Transparent to the interface" means that drive defects are hidden to the host (i.e. the drive appears with zero defects).

The *only* exception are SMART data, because SMART data is supposed to show a drive's health status.

"Re-allocated Sectors" indicate how many times the drive encountered a defective sector, was able to reconstruct the data and moved the data to a spare sector.

"Recoverable Errors" are soft errors where the drive encountered an error in a sector but was able to restore the data and successfully rewrite the sector (no re-allocation required).

Both are "correctable errors"

"Unrecoverable errors" are when the drive encounters bad data and is incapable of restoring the original data.

Quote
Quote
Quote
I would have noticed if the defect list grew without errors being reported.  Many times I have done extended SMART surface scans and watched for this very thing and it never happened.  Most recently I have done it multiple times in the past couple of weeks on a pair of 1TB WD Greens, but doing an external surface scan which includes writing had some results.

Because you don't seem to fully understand what you are seeing. Any high level surface scan tool only scans the area that the drive is reporting as good (it has no access to the whole drive area - "defect-free Interface", remember?). Defects are hidden because defect management is completely *transparent*. The only way you can check what's actually going on in the drive is through SMART data.

When you're at the point where your tool can "see" defects then that means the drive has developed unrecoverable errors and is defective and should be discarded, but at the very least should not be used to store anything of importance.

Did I say "high level surface scan" or "SMART surface scan"?

It doesn't matter as SMART Short Test and SMART Extended Tests are both tests that only read the user area (not the extended area that contains the reserve blocks or any other surface area, e.g. the firmware areas some drives have). The only difference is where the results go (to the host PC on a regular surface scan, to the disk controller on a SMART Test).

On top of that, both SMART Short Test and Long Test are generally read-only. And they are so because on a healthy drive there shouldn't be any unrecoverable errors. Also, sector re-allocation normally only works for the user area, if there's a defect in the firmware area which can't be corrected by ECC then the drive is toast.

With modern drives, there is no way to perform a low-level format or even a low-level surface scan. Simple as that.

Quote
Quote
Quote
Quote
SSDs are a different matter as they should not have any patrol reads done on them, and less so any writes. SSDs also have internal defect management (which is part of it's Garbage Collection) and unlike with hard drives this normally works without having to be triggered (which for SSDs is done with the TRIM command).

The difference is that SSD must perform verification after every write as part of the write process, and they also scrub-on-read of sectors which are still correctable if they have too many soft errors.  Many perform idle time scrubbing because retention of modern Flash memory is abysmal, or at least their specifications say they do.  I sure know that many do not and will happily suffer bit rot while powered.

There's nothing in SSDs that matches "scrubbing". Which, on flash, would be completely counterproductive because every read of a flash cell reduces the amount of charge that is stored in that cell, so if a SSD did regular "read-scrubs" then the charge level would quickly become so low that the sector would have to be re-written to maintain its information, a process which causes wear on the flash cell.

Read disturbance is observable and needs to be taken into account, but is minor compared to other factors.

Correct, although the effects become more pronounced the smaller the flash structures and the more levels a cell has to store.

Quote
Read disturbance and write disturbance also affect cells which are not being accessed, making idle time scrubbing in some form more important.

No, it doesn't (Read Disturbance is captured by ECC). Also, read cycles are monitored, as is cell aging, and Wear Leveling moves data to other blocks when ECC detects an error or once a certain number of P/E or read cycles has been performed, which alleviates the issue completely.

No patrol reads required.
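Conceptually, that bookkeeping amounts to little more than this (a toy Python sketch of the idea, not any real controller's firmware; the threshold value is made up):

READ_DISTURB_LIMIT = 50_000   # illustrative threshold only, not a datasheet value

def after_block_read(block, corrected_ecc_errors, relocate):
    # Count reads per block; once ECC has to correct something, or the read
    # counter crosses the threshold, move the data to a fresh block before
    # read disturbance accumulates into uncorrectable errors.
    block.read_count += 1
    if corrected_ecc_errors > 0 or block.read_count >= READ_DISTURB_LIMIT:
        relocate(block)          # copy data elsewhere, queue the old block for erase
        block.read_count = 0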

Quote
Quote
Quote
However power loss protection is also required for *any* write operation, and possibly any erase operation.  The reason for this is that interrupted writes can not only corrupt the data which is being written, but also data in other sectors, including the flash translation table, which can result in a non-recoverable situation.

There are two reasons data can be so easily corrupted by an incomplete write.  With multi-level flash, the existing data in a sector being updated is at risk during the update, and it is easy to see why.  It seems to me this would be avoidable by only storing the data of one sector across the multiple levels, but apparently flash chips are not organized that way.

First of all, SSDs internally are physically organized in blocks, not sectors. Remember the LBA I mentioned above? LBA is *exactly* how SSDs are structured internally. On SSDs, sectors are an artificial construct which bears zero relation to the flash cell where the information is physically located.

That distinction is irrelevant for this discussion, which is not about write amplification.

It's relevant because there are no sectors in an SSD (there are blocks), while you talk about sectors, which have no relevance to the location of the physical data.

Quote
Quote
Garbage Collection and Wear Levelling also don't need PLP. If power is interrupted during the deletion of a block then the block will remain marked as for deletion and deletion will be repeated after power comes back up. For Wear Levelling, when data is moved from one block to another then the data from the old block is copied to a new block and then the old block is marked for deletion. If that process is interrupted then the old data is still there and the block shift is simply repeated.

*Any* interrupted write can destroy data which is not in the current block.

No, it can't. Because the original block is erased only after the write process in the new block has completed. If that process is interrupted then the original data is still there.
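Sketched out as pseudo-ordering (conceptual Python only, not vendor firmware), the move commits only after the copy has finished, which is why no PLP is needed for it:

def move_block(old_block, fresh_block, mapping, logical_block):
    # Wear-levelling block move, in the order that makes it power-loss safe:
    data = old_block.read()                # 1. read the data out of the old block
    fresh_block.program(data)              # 2. program it into an already-erased block
    mapping[logical_block] = fresh_block   # 3. repoint the logical block mapping
    old_block.mark_for_erase()             # 4. only now is the old block queued for erase
    # Power loss before step 4 leaves the original data untouched; the move is
    # simply repeated on the next power-up.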

Quote
Quote
Quote
The other reason is more insidious; the state machine which controls the write operation can glitch during power loss,

The part in an SSD which controls write operations (and everything else) is the SSD controller, and that is not a State Machine; it's a micro-controller running specific software (the drive's firmware) to perform its duties.

Oh good, they implemented the state machine in a micro-controller so all is solved!

I'm not sure you really understand the concept of "State Machine".

The flash controller in a typical SSD is not much different than the main processor of the host it is connected to. Many SSD controllers are ARM based.

Quote
It was not only OCZ drives which failed, but it was *every* drive which did not have power loss protection.  Of particular note is that all of the drives with Sandvine controllers, which were advertised as not requiring power loss protection, failed.

I really would like to see some evidence for that claim. Because it sounds bogus to me. Of course SSDs without PLP can lose data but they certainly shouldn't fail.

Besides, I don't know any "Sandvine" SSD controllers (I know SandForce controllers, part of LSI) but maybe I just missed it.
« Last Edit: October 11, 2020, 07:28:03 am by Wuerstchenhund »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6762
  • Country: fi
    • My home page and email address
Re: HDDs and badblocks corrupt downloaded files?
« Reply #107 on: October 09, 2020, 10:48:06 am »
CompactFlash cards (as used on many portable devices like cameras) are very sensitive to power loss.  As in, there is a good chance a sudden power loss bricks the entire card.  Early USB memory sticks used to be very sensitive to that too, but I guess users are so careless at removing them that they had to make them more robust against power loss.

I am not aware of any SATA or SCSI drives (talking about spinning disks) manufactured in this millennium that are particularly sensitive to unexpected power loss.  (If power loss is associated with physical acceleration, like dropping, then physical damage may occur; but it is difficult to say how much, if any, of that damage probability is due to power loss.)  Data loss is to be expected, depending on the file system used, but I am talking about drive functionality here.

I suspect that a big reason for this is the availability and price of supercaps, or ordinary high-capacitance capacitors, that can retain enough power to do a controlled shutdown even when supply cuts off unexpectedly.

If so, a particular batch of drives could be particularly vulnerable to sudden power loss, not because of firmware issues, but because of substandard capacitors, or cost-cutting at the OEM.  This kind of weakness often skews experiences at data centers and clusters, where you get dozens to hundreds of the same component from the same (or a few different) manufacturing batches.  (I do not have enough experience on SSDs to have an opinion; the clusters I admin'd used spinning rust.)

This means that even if you think you know something cannot happen, that does not make it true; the underlying reason can be different, but the overall failure pattern the same.
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3664
  • Country: us
Re: HDDs and badblocks corrupt downloaded files?
« Reply #108 on: October 09, 2020, 05:10:57 pm »
Even hard drives from the 1960s were designed to survive power cuts without damage. A common design was a head-retract solenoid that could take power from the collapsing field of the spindle motor in case power was disrupted, pulling the heads away from the surface before loss of lift.

Voice-coil drives can unload the heads in tens of milliseconds, and the energy required is easily stored in a MLCC.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6762
  • Country: fi
    • My home page and email address
Re: HDDs and badblocks corrupt downloaded files?
« Reply #109 on: October 09, 2020, 07:00:10 pm »
Even hard drives from the 1960s were designed to survive power cuts without damage.
My first hard drive in 1989 or so, a 32 MByte MFM drive (which came formatted at 20 MBytes, on a Hyundai '286 clone with ATi EGA Wonder graphics card), had something like that implemented, but in real life, it would not survive too many cycles of that: one had to use the hdpark utility (if I recall the DOS command right) before shutdown.  I think it was just a cheap implementation in that drive... (Within three years, the motherboard gave up the ghost; then the ADI 14" monitor, and finally the HDD.  And boy was that machine loud, like a turbine engine whenever it was powered on.)

I mean, while something is supposed to be designed in, does not mean the bean counters etc. haven't insisted on so much cheapness it does not actually work that well in practice.  It is common in all fields of tools, really: given enough cost-cutting and striving for short-term gains over long-term ones, everything turns to shit.
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3664
  • Country: us
Re: HDDs and badblocks corrupt downloaded files?
« Reply #110 on: October 10, 2020, 02:18:38 am »
Oh, the small stepper-motor drives of the 1980s were definitely not as robust. In particular, they mostly didn't have the ability to unload the heads on power failure: the "Landing Zone" BIOS parameter and "PARK.COM" utilities worked by seeking to a dedicated non-data cylinder and then tying the poles of the stepper together so they resisted movement from that spot. MFM drives had no onboard controller, so they basically had no "brains" when it came to protecting themselves. If you forgot to run "PARK.COM" before cutting power, the heads would land on the data area* and might scrape off magnetic material. This was compounded by computers without soft power/shutdown capabilities, where cutting power was always needed to power off.

Related to this evolution and different levels of technology, Seagate has been the foremost company in most people's mental landscape for magnetic media, but the drives they developed in-house were the slow stepper-type described above. It was only when they acquired Imprimis from Control Data that they gained high speed, voice-coil operated, advanced drives with integrated controllers. The famous Seagate Hawk and Barracuda were Imprimis designs.

*This is the defining feature of a "Winchester" drive. The IBM Winchester project developed the first drives that didn't need emergency head-retract solenoids, because the heads had the capability to land on the surface without crashing and destroying themselves.
« Last Edit: October 10, 2020, 03:15:18 am by helius »
 
The following users thanked this post: Nominal Animal

Offline BradC

  • Super Contributor
  • ***
  • Posts: 2109
  • Country: au
Re: HDDs and badblocks corrupt downloaded files?
« Reply #111 on: October 10, 2020, 10:17:59 am »
Ahhh Spinrite! Thanks for reminding me. I think they put more effort into the ASCII animations and various technical-looking screens than actually doing anything meaningful.

It *is* very pretty. Back in the old days from memory it would also re-interleave your HDD to optimise the track layout for your machine. I always used Norton for that.

The "dynastat" recovery harks way back, and it is pretty clever when you think about it. Take many, many reads of the same sector with the ECC "disabled" and build a statistical version of what it thinks the sector is. That worked particularly well for floppies.

One of the other tricks it *had* up its sleeve was intermittent seeks between reads, which would make interesting use of the backlash in the old stepper mechanisms (both FDD and HDD) to center the head over different parts of the same track. Of course that means nothing these days, but it was another neat trick.

When you re-read that, it is very much "take a pretty informed, but ultimately speculative guess at the real sector contents and then write it back as authoritative". All the other "maintenance" stuff is placebo at best.

 
The following users thanked this post: MK14

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: HDDs and badblocks corrupt downloaded files?
« Reply #112 on: October 11, 2020, 07:37:02 am »
CompactFlash cards (as used on many portable devices like cameras) are very sensitive to power loss.  As in, there is a good chance a sudden power bricks the entire card.  Early USB memory sticks used to be very sensitive to that too, but I guess users are so careless at removing them that they had to make them more robust against power loss.

That doesn't make any sense. Removable storage like USB sticks or CF cards have no "power down" function (as for example SSDs have), in fact CF cards are widely used in devices which don't perform some kind of a shutdown but just remove power (and even those that do are susceptible to sporadic power loss).

It doesn't kill the CF card, nor should it.

For removable devices, the only reason they should be "removed" via the appropriate OS functions is to tell the OS to flush any data that has not yet been written and then stop writing to the device, to avoid corrupting the file system on that drive. It has nothing to do with sudden power loss (which is not, and never was, a problem for removable flash storage devices).
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6762
  • Country: fi
    • My home page and email address
Re: HDDs and badblocks corrupt downloaded files?
« Reply #113 on: October 11, 2020, 08:23:44 am »
CompactFlash cards (as used on many portable devices like cameras) are very sensitive to power loss.  As in, there is a good chance a sudden power bricks the entire card.  Early USB memory sticks used to be very sensitive to that too, but I guess users are so careless at removing them that they had to make them more robust against power loss.

That doesn't make any sense.
Sudden power loss during a write operation – say, updating directory access timestamps or similar; anything that requires an update of the internal state.  They just don't have the circuitry to deal with it.  Technically, it would be trivial: just add some capacitors that provide enough current to finish any pending write operation.

Removable storage like USB sticks or CF cards have no "power down" function (as for example SSDs have), in fact CF cards are widely used in devices which don't perform some kind of a shutdown but just remove power (and even those that do are susceptible to sporadic power loss).
See what happens when you remove the device battery when it is writing to the CF card.  At least a few years ago, the chances were the card was bricked.

It has nothing to do with sudden power loss (which is and never was a problem for removable flash storage devices).
Hogwash.  CF cards maintain their own wear leveling tables (and other internal state), and for whatever reason, tend to get bricked whenever power loss occurs during the update of that state.  It has absolutely nothing to do with the host accessible data (filesystem et cetera), and everything to do with the device internal state.

Early USB memory sticks suffered from exactly the same problems.  I had several 64Mbyte and 128Mbyte sticks that died this way.  They soon fixed the issue; I do not know exactly how, but I believe by using capacitors to hold enough charge so that the internal state could always be updated successfully.

(One way to brick USB memory sticks is to short the device-side VUSB and GND via a small value bleed resistor immediately after the device side is disconnected from the host side, and do so when writing to the stick.  The bleed resistor depletes the capacitors, so the management chip is unable to correctly update its internal state.
Again, this has nothing to do with flash memory tech per se, and everything to do with the device's internal non-exposed state being only partially updated.)

Anecdotally, many CF cards still lack protection against sudden power loss.  One reason could be that since these use the ATA (PATA aka parallel-ATA) interface, the charge needed for a full internal state update is so large that many manufacturers skimp on the capacitors.  I suspect, but cannot verify, that CF Revision 4.1 (Power Enhanced CF) is related.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8640
  • Country: fi
Re: HDDs and badblocks corrupt downloaded files?
« Reply #114 on: October 11, 2020, 09:08:20 am »
Oh, flash memory corrupting at power loss does make sense.

Flash memory works by erasing a larger unit (variably called a "block", "sector", or something else) at a time into a certain state (usually logical '1', though in some implementations logical '0' has been seen); this operation is quite slow. Then only the opposite state (usually logic '0') can be written bit-by-bit, as a relatively fast operation.

As a result, power loss is obviously always a consideration when modifying anything, including any internal data. Any modification of an erasable unit that may contain changes from '0' to '1' requires a read-the-entire-unit / erase / modify / write-back-the-entire-unit operation.
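A toy Python model of that constraint (purely illustrative, with no real part numbers or geometry): erase leaves every byte at 0xFF, programming can only clear bits, so any 0-to-1 change drags the whole unit through the slow cycle:

def update_unit(unit, offset, new_bytes):
    # Toy flash semantics: unit.erase() sets everything to 0xFF, unit.program()
    # can only clear bits.  An update that needs any 0 -> 1 transition forces a
    # read / erase / modify / write-back of the entire erasable unit -- and that
    # is the window in which a power cut can take out unrelated data.
    image = bytearray(unit.read())
    needs_erase = any(new & ~old & 0xFF
                      for old, new in zip(image[offset:offset + len(new_bytes)], new_bytes))
    image[offset:offset + len(new_bytes)] = new_bytes
    if needs_erase:
        unit.erase()                   # slow, and all-or-nothing
    unit.program(image)                # write the modified image back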

Such operations should obviously be avoided, but it's equally obvious that where there is a chance of a problem, in a field of quickly developed, cheap products, buggy or poor implementations are inevitable; they do tend to get better over time.

But the big question is: What is the name of the function in HDDs that automatically prevents downloaded files from being saved in bad sectors?
« Last Edit: October 11, 2020, 09:11:01 am by Siwastaja »
 

Offline classicsamus87Topic starter

  • Regular Contributor
  • *
  • !
  • Posts: 97
  • Country: br
Re: HDDs and badblocks corrupt downloaded files?
« Reply #115 on: October 11, 2020, 03:42:33 pm »
What is the relationship of S.M.A.R.T. with the automatic technology present in firmware for reallocation and isolation of HDD defective sectors (bad blocks)?
 

Offline Monkeh

  • Super Contributor
  • ***
  • Posts: 8042
  • Country: gb
Re: HDDs and badblocks corrupt downloaded files?
« Reply #116 on: October 11, 2020, 04:18:37 pm »
What is the relationship of S.M.A.R.T. with the automatic technology present in firmware for reallocation and isolation of HDD defective sectors (bad blocks)?

Well, SMART took it out for dinner once, but it's purely platonic.
 
The following users thanked this post: BravoV, Siwastaja, tooki, Ysjoelfir

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6762
  • Country: fi
    • My home page and email address
Re: HDDs and badblocks corrupt downloaded files?
« Reply #117 on: October 11, 2020, 06:03:29 pm »
But the big question is: What is the name of the function in HDDs that automatically prevents downloaded files from being saved in bad sectors?
I do not believe there is a name for it, only the description – reallocation of bad/failed blocks/sectors.

I do believe all the most relevant papers and patents can be found using that description (those terms and their variants in English).  None of the patents I linked to earlier seem to use any common name for the operation, but they do describe the approaches to quite some detail.  (Including heuristic reallocation before actual failure based on soft errors/correctable ECC error rate per block, which is the only approach I know of that preserves existing data; unfortunately, since Samsung sold its hard drive unit to Seagate, it is impossible to tell if it is used in any commercially available spinny-disk drives.)

(Your suggestion for the name, "sector reallocation", works for me though.)
« Last Edit: October 11, 2020, 06:07:01 pm by Nominal Animal »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8640
  • Country: fi
Re: HDDs and badblocks corrupt downloaded files?
« Reply #118 on: October 12, 2020, 06:00:02 am »
What is the relationship of S.M.A.R.T. with the automatic technology present in firmware for reallocation and isolation of HDD defective sectors (bad blocks)?

Joking aside, the serious answer to this is that SMART maintains and reports statistics about the reallocation (among many other things) to the user.
 

Offline BradC

  • Super Contributor
  • ***
  • Posts: 2109
  • Country: au
Re: HDDs and badblocks corrupt downloaded files?
« Reply #119 on: October 12, 2020, 10:59:20 am »
Joking aside, the serious answer to this is that SMART maintains and reports statistics about the reallocation (among many other things) to the user.

Well, SMART is designed to maintain and report statistics about the reallocation (among many other things) to the user. What it does in reality is entirely what the firmware writer tells it to do, and it often "misrepresents the truth": it papers over imperfections, downright lies about other statistics, and/or reports them in a "vendor specific" and mostly undocumented form.

The only thing SMART is reliably good for is making statistical comparisons between drives in a large population of identical or mostly similar drives.
 

Offline MK14

  • Super Contributor
  • ***
  • Posts: 4853
  • Country: gb
Re: HDDs and badblocks corrupt downloaded files?
« Reply #120 on: October 12, 2020, 07:38:10 pm »
What is the relationship of S.M.A.R.T. with the automatic technology present in firmware for reallocation and isolation of HDD defective sectors (bad blocks)?

https://www.mjm.co.uk/articles/bad-sector-remapping.html

Quick Summary:
Once the drive's pool of spare sectors has been used up (via the HDD's firmware), the drive is determined to be in need of replacement. The "backup and replacement is recommended" flag in S.M.A.R.T. is then set, to warn the user that the disk is now considered "bad" and needs replacing.
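That overall verdict can be read back with smartctl (from smartmontools); a trivial Python sketch, with /dev/sda as a placeholder device:

import subprocess

# "-H" prints the overall-health self-assessment; a FAILED result means a SMART
# threshold (such as the spare sector pool) has been exceeded and the drive
# should be backed up and replaced.
print(subprocess.run(["smartctl", "-H", "/dev/sda"],
                     capture_output=True, text=True).stdout)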
« Last Edit: October 12, 2020, 07:41:11 pm by MK14 »
 

Offline classicsamus87Topic starter

  • Regular Contributor
  • *
  • !
  • Posts: 97
  • Country: br
Re: HDDs and badblocks corrupt downloaded files?
« Reply #121 on: October 12, 2020, 10:39:20 pm »
Do HDDs from the year 2000 have this automatic function in the firmware to reallocate and isolate defective sectors (bad blocks)? I searched and didn't find much information, because this function has no name.


I downloaded important files onto the HDD and I thought they were saved in the bad sectors.
« Last Edit: October 12, 2020, 10:41:00 pm by classicsamus87 »
 

Offline MK14

  • Super Contributor
  • ***
  • Posts: 4853
  • Country: gb
Re: HDDs and badblocks corrupt downloaded files?
« Reply #122 on: October 13, 2020, 01:07:14 am »
this function has no name

If you search for "P-LIST G-LIST T-LIST", you should find some details.
They are the Production/Primary/Permanent, Growing, and Track (P/G/T) DEFECT LISTS.

E.g. Amongst the many links, you can find:
https://www.pcmag.com/encyclopedia/term/hard-disk-defect-management
 

Offline classicsamus87Topic starter

  • Regular Contributor
  • *
  • !
  • Posts: 97
  • Country: br
Re: HDDs and badblocks corrupt downloaded files?
« Reply #123 on: October 13, 2020, 01:29:46 pm »

Is this the name of the automatic reallocation function of defective sectors of the HDD present in the firmware?
 

Offline Monkeh

  • Super Contributor
  • ***
  • Posts: 8042
  • Country: gb
Re: HDDs and badblocks corrupt downloaded files?
« Reply #124 on: October 13, 2020, 03:31:35 pm »

Is this the name of the automatic reallocation function of defective sectors of the HDD present in the firmware?

No. The function is called Jim.
 
The following users thanked this post: BravoV, hexreader, tooki, Ysjoelfir

