Author Topic: how can I voluntarily damage an SSD to test a testing program?  (Read 3995 times)


Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
how can I voluntarily damage an SSD to test a testing program?
« on: February 12, 2021, 07:58:34 pm »
I am writing a "storage" device testing program.

Yesterday I voluntarily damaged an old 1.6 Gbyte electro-mechanical hard disk drive, in order to see how my testing program reacts to true physical damage and whether it is able to find and report it.

I opened the cover and made some scratches on the disk surface; this is the result:

Code: [Select]
opening /dev/hdc ... done
disk_test
Checking for bad blocks in read-write mode
From 0 to 1.610.735.615, by 8.192 bytes at time
   [  0   ] BBBBBBBBB..BBBBB.BBBBBBBBBBBB........BBBBBBBBBBBBBBBBBBBBBBBBBBB
   [ 16.6 ] BBBBBBBBBBBBBBBBBBBBBBB.BBBBBBBBBBBBBBBBBB.BBBBBBBBBBBB....BBBBB
   [ 33.3 ] BBBBBBBBBBB...............BBBBBBBBBBB.BBBBBBBBBBBBB.BBBBBBBBBBBB
   [ 49.9 ] BB...BBBBBBBBBBBBBBB.BBBBBBBBBBBBBBB......BBBBBBBBBBBBBBBBBBBBBB
   [ 66.6 ] BBBBBBBBBBBBBBBBB....BBBBBBBBBB.BBBBBBBBBBBBBBB....BBBBBBBBB.BBB
   [ 83.3 ] BBBBBBBBBBBBBBBBBB.BBB.BBBBBB.BBBBB..BB..BB.BBBBB.BBB...........
   [ 99.9 ] .

"." means no problem
"B" means error during I/O

Excellent! The damage is there, correctly discovered and reported! :D

Now I need a damaged old SSD in order to run similar tests. I can buy a cheap second-hand SSD on Bonanza, but how can I voluntarily damage it?


(It sounds crazy, I know, but ... it's done for scientific reasons, so ... )
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline Johnny10

  • Frequent Contributor
  • **
  • Posts: 900
  • Country: us
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #1 on: February 12, 2021, 08:26:56 pm »
Why not ask EEV Forum members if they will send you one?
Tektronix TDS7104, DMM4050, HP 3561A, HP 35665, Tek 2465A, HP8903B, DSA602A, Tek 7854, 7834, HP3457A, Tek 575, 576, 577 Curve Tracers, Datron 4000, Datron 4000A, DOS4EVER uTracer, HP5335A, EIP534B 20GHz Frequency Counter, TrueTime Rubidium, Sencore LC102, Tek TG506, TG501, SG503, HP 8568B
 

Offline Jwalling

  • Supporter
  • ****
  • Posts: 1517
  • Country: us
  • This is work?
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #2 on: February 12, 2021, 08:45:25 pm »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...
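
Something along these lines, for example. An untested sketch, assuming a POSIX box rather than an actual .cmd file; the file name and sizes are just examples:
Code: [Select]
/* Rough sketch of the idea above: keep rewriting one ~48 KB file with
 * alternating 0x00 / 0xFF patterns, syncing each pass so the writes actually
 * reach the drive instead of sitting in the page cache.
 * File name and size are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE (48 * 1024)   /* just shy of 50KB */

int main(void)
{
    static unsigned char buf[FILE_SIZE];

    for (unsigned long pass = 0; ; pass++) {
        /* even passes write all 0x00, odd passes all 0xFF */
        memset(buf, (pass & 1) ? 0xFF : 0x00, sizeof buf);

        int fd = open("wear_me.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            perror("write");    /* first hint that the drive is giving up */
            return 1;
        }
        fsync(fd);              /* force it out of the page cache */
        close(fd);

        if (pass % 100000 == 0)
            printf("pass %lu\n", pass);
    }
}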
Jay

System error. Strike any user to continue.
 
The following users thanked this post: DiTBho

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #3 on: February 12, 2021, 08:49:56 pm »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...

Thanks! Great idea!  :D
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online Ian.M

  • Super Contributor
  • ***
  • Posts: 13049
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #4 on: February 13, 2021, 07:48:43 am »
AFAIK, you are in the UK.  You could try asking your nearest branch of CeX (second hand media & electronics) if they'd be willing to sell you SSDs that fail testing, but are still recognized by the controller.  You'd probably need to  check back with them regularly, as I'd bet at the moment their warranty returns get scrapped. 
« Last Edit: February 13, 2021, 07:50:25 am by Ian.M »
 
The following users thanked this post: DiTBho

Online MK14

  • Super Contributor
  • ***
  • Posts: 4883
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #5 on: February 13, 2021, 07:55:19 am »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...

Disclaimer: I'm absolutely NOT an expert on SSDs. So if anyone claims differently, listen to them.

I thought SSDs use over-provisioning, i.e. they have significantly more storage area than claimed/reported, specifically to counter the very situation you are describing (among other reasons).
I.e. if it is advertised as being exactly 100GB, and reports being exactly 100GB to the OS, it really is something like 105GB+ in size. So it would merely use the remaining 5GB (which you CAN'T fill up with your data) as a wear-levelling area and as space for unusable 'bad blocks'.

Also note that SSDs try to use their RAM (specifically the types with supercapacitors to hold the supply rails up after power-off; I'm not sure how non-supercapacitor types use RAM) to store rapidly/regularly changing sectors. That is really where the 50kB test file will probably reside (until a power-off cycle forces it to write the RAM-buffered sectors onto the actual flash).

Additionally, SSDs remap sectors very extensively, so writing a 'bad block' testing/detection program is probably extremely challenging, because the drive won't readily tell you where the sectors really are and/or let you access the 'removed from use' (bad-block) ones, etc.

It sounds to me like your 'test program' will just see what the OS sees, which is the remapped and fault-corrected sectors, until the drive gets too many 'bad blocks', runs out of over-provisioning (and maybe other) areas, and then does something like insisting on working only in read-only mode (for data recovery), refusing any further writes or changes to data on the SSD.
I.e. the SSD would be what some call 'bricked'.

« Last Edit: February 13, 2021, 08:06:47 am by MK14 »
 
The following users thanked this post: hans, Fraser, janoc, newbrain

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #6 on: February 13, 2021, 08:20:20 am »
It sounds to me like your 'test program' will just see what the OS sees, which is the remapped and fault-corrected sectors, until the drive gets too many 'bad blocks', runs out of over-provisioning (and maybe other) areas, and then does something like insisting on working only in read-only mode (for data recovery), refusing any further writes or changes to data on the SSD.
I.e. the SSD would be what some call 'bricked'.

Right. A brick is the most likely outcome in the case of a modern drive. Better to get an old "mechanical" 3.5" HDD known to have bad blocks. Hardware hoarders could have plenty, not to mention eBay.
 
The following users thanked this post: MK14

Online Ian.M

  • Super Contributor
  • ***
  • Posts: 13049
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #7 on: February 13, 2021, 08:35:38 am »
Some SSDs even deliberately brick themselves (i.e. not recognized) on the next power cycle once they've detected a critical failure.

« Last Edit: February 13, 2021, 08:37:32 am by Ian.M »
 
The following users thanked this post: MK14

Offline Halcyon

  • Global Moderator
  • *****
  • Posts: 5907
  • Country: au
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #8 on: February 13, 2021, 08:37:21 am »
Exactly what MK14 said. Plus a good test program will look at SMART data as well. What better source of data is there than the drive itself? If it's reporting problems, your program should be looking at that and reporting it too, not just blindly accepting what the OS tells you, particularly when it comes to SSDs.
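
For instance, even just shelling out to smartctl (from smartmontools) and looking at the overall health result gives you a cross-check. A rough, untested sketch, with the device path only as an example:
Code: [Select]
/* Sketch of a SMART cross-check: run smartctl and scan its output for a
 * failed overall-health assessment.  A real program might talk to the drive
 * directly via ATA/NVMe pass-through instead; "/dev/sda" is just an example. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* -H prints the overall health assessment, -A the attribute table */
    FILE *p = popen("smartctl -H -A /dev/sda", "r");
    if (!p) { perror("popen"); return 1; }

    char line[512];
    int failed = 0;
    while (fgets(line, sizeof line, p)) {
        fputs(line, stdout);            /* keep the raw report for the log */
        if (strstr(line, "FAILED"))     /* overall-health self-assessment */
            failed = 1;
    }
    pclose(p);

    puts(failed ? "SMART cross-check: the drive itself reports a failure!"
                : "SMART cross-check: no overall failure reported.");
    return failed;
}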

 
The following users thanked this post: MK14

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #9 on: February 13, 2021, 09:59:48 am »
I have added the SMART check as a "cross-check", thanks for the tips  :D

I need "guinea pigs": SSDs to be used as subjects for experiments, in order to detect false positives, false negatives, etc.
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #10 on: February 13, 2021, 10:02:27 am »
AFAIK, you are in the UK.  You could try asking your nearest branch of CeX (second hand media & electronics) if they'd be willing to sell you SSDs that fail testing, but are still recognized by the controller.  You'd probably need to  check back with them regularly, as I'd bet at the moment their warranty returns get scrapped.

Great idea! Thanks! That's why I opened the topic in the Chat area, just to catch tips like this!
It seems the best idea ever  :D
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online hans

  • Super Contributor
  • ***
  • Posts: 1683
  • Country: nl
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #11 on: February 13, 2021, 10:02:47 am »
Some SSDs even deliberately brick themselves (i.e. not recognized) on the next power cycle once they've detected a critical failure.

I believe some SSD firmware/controllers can also go to a read-only state on extensive failures. That way easy data recovery is still possible.

HDDs also have reserve sectors, by the way, so that damaged sectors may be discarded after an erase, AFAIK. If you run badblocks on a drive, never let it convince you of a failure with a single pass. Sometimes the drive can correct the damaged sector, update the SMART data (which has pre-fail indicators like the reallocated sector count), and go from there. If it still fails after the drive has been completely write-cycled, then for sure it's broken (e.g. by r/w head damage).

I've used a pre-fail disk for some time as a NAS scratch disk for media downloads that were non-critical and reproducible. At the time I wasn't in a position to buy a bunch of HDDs for a RAID array. That drive, a notorious Seagate 2TB model, completely failed after about a year or so... so yes, it's a risk to keep running storage devices like that. But I had no regrets; I only had a bunch of movies and IPTV recordings on that drive.

I have seen the tech site hardware.info trying to wear out a Samsung 840 250GB TLC SSD in 2013. They went for the straightforward approach: fill the drive with 160GB of static data, then keep refilling the remaining area 24/7 till it fails. The drive was rated for 1000 P/E cycles. They saw their first reallocated sector at 2945 P/E cycles on one drive (707TBW). At 3187 P/E cycles (764TBW) they saw an uncorrectable error (512kB lost). The drive gave up at 3706 P/E cycles, with 888TBW. This test took over 3 months to complete, running 24/7.

They noted that the drive would reshuffle data in the background to even out wear. I also think that 250GB is an odd-ball value: the drive probably has 256GB worth of TLC flash cells, but uses a portion of the cells as SLC cache (which has an order of magnitude higher wear tolerance) or reserve sectors. For the latter you probably only need a few hundred MB or several GB, depending on how precisely you can specify the P/E expectancy and variance. I.e. if you can predict that only a few dozen cells will have worn out at 1k P/E, you don't need to take up much reserve space. 1k P/E cycles was quite conservative for this particular drive - they got almost 3x the lifetime out of it.

Source (in Dutch, unfortunately): https://nl.hardware.info/artikel/4177/10/hardwareinfo-test-levensduur-samsung-ssd-840-250gb-tlc-ssd-eind-update-20-6-2013-update-9-eindconclusie-20-6-2013
« Last Edit: February 13, 2021, 03:31:19 pm by hans »
 
The following users thanked this post: MK14, DiTBho

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #12 on: February 13, 2021, 11:05:59 am »
Additionally, SSDs remap sectors very extensively, so writing a 'bad block' testing/detection program is probably extremely challenging, because the drive won't readily tell you where the sectors really are and/or let you access the 'removed from use' (bad-block) ones, etc.

Yup, this is a kind of "legacy" requirement, needed to also test electro-mechanical HDDs.

The testing program needs to support both technologies, and offer the user the possibility to set up the testing mode.

[..] which is the remapped and fault-corrected sectors, until the drive gets too many 'bad blocks', runs out of over-provisioning (and maybe other) areas

This is what I am trying to study. I have zero experience with SSDs and flash-based storage devices.


RAM-disks (volatile, not permanent, used by industrial machines with high IOPS)
I also have to support and test RAM-disks, and in this case I have to test the number of IOPS that go through an I/O system request. If there is a delay on a RAM-disk made with ECC RAM, it means the DRAM controller encountered a problem: the CRC failed and caught a parity error, and the controller tried to repeat the operation. If this happens too many times ... well, you don't see it happening from user-space, and even the kernel does not know anything about it; only the controller inside the RAM-disk knows what's happening, and it won't tell you anything unless there is a solid failure, which is too late.

The failure can be detected before it becomes serious: if you disable caching, issue a large number of IOPS, measure the time, and notice a selective delay in a certain area, it means there is probably something about to go wrong with the decoupling capacitors or something else on the physical PCB in the specific area where the DRAM chips are mounted! Or, worse still, part of the ECC RAM is becoming faulty and unreliable and needs to be replaced.
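
Roughly like this (a simplified, untested sketch of the timing idea, not the real program; the device path, region size and the "3x the average" threshold are just examples):
Code: [Select]
/* Simplified sketch: read the device region by region with O_DIRECT (so the
 * page cache is bypassed), time each region, and flag regions whose latency
 * is far above the average.  Device path and sizes are placeholders. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define REGION   (4UL * 1024 * 1024)   /* 4 MiB per timed chunk */
#define NREGIONS 256                   /* sample the first 1 GiB */

static double now_s(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    void *buf;
    if (posix_memalign(&buf, 4096, REGION)) return 1;

    int fd = open("/dev/ramdisk0", O_RDONLY | O_DIRECT);   /* placeholder device */
    if (fd < 0) { perror("open"); return 1; }

    double t[NREGIONS], sum = 0.0;
    for (int i = 0; i < NREGIONS; i++) {
        double t0 = now_s();
        if (pread(fd, buf, REGION, (off_t)i * REGION) != (ssize_t)REGION) {
            perror("pread");
            return 1;
        }
        t[i] = now_s() - t0;
        sum += t[i];
    }

    double avg = sum / NREGIONS;
    for (int i = 0; i < NREGIONS; i++)
        if (t[i] > 3.0 * avg)           /* crude outlier rule */
            printf("region %d looks suspicious: %.3f ms vs avg %.3f ms\n",
                   i, t[i] * 1e3, avg * 1e3);

    close(fd);
    free(buf);
    return 0;
}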

My RAM-disk has several RAM sticks installed, something like 32 banks of 4 Gbyte each. I did an experiment with a deliberately damaged ECC RAM stick in one of the 32 banks (I removed a capacitor from an old RAM stick), and my testing program caught it: bank#4 looks suspicious, slow IOPS, please check it!

Bingo! :D

SMART does somehow support this; unfortunately, my RAM-disk devices do not have these diagnostic features implemented. If I issue a check, it always returns "not supported" or, worse still, "all OK".

That's why I have added SMART as an additional layer to run cross-tests: this way I can write only one C program, organized into C sub-modules, and use it to test three kinds of storage technology: HDDs, SSDs, and RAM-disks!
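
Roughly organized like this (an invented, simplified sketch with stub functions instead of the real sub-modules; names are just for illustration):
Code: [Select]
/* One test program, one dispatch table, one entry per storage technology.
 * The stub functions stand in for the real sub-modules. */
#include <stdio.h>
#include <string.h>

typedef int (*test_fn)(const char *dev);

struct storage_backend {
    const char *name;        /* "hdd", "ssd", "ramdisk" */
    test_fn surface_scan;    /* bad-block style read/write test */
    test_fn smart_check;     /* SMART cross-check, NULL if not applicable */
    test_fn latency_scan;    /* timing test, mainly for RAM-disks */
};

static int surface_scan(const char *dev) { printf("surface scan on %s\n", dev); return 0; }
static int smart_check (const char *dev) { printf("SMART check on %s\n", dev);  return 0; }
static int latency_scan(const char *dev) { printf("latency scan on %s\n", dev); return 0; }

static const struct storage_backend backends[] = {
    { "hdd",     surface_scan, smart_check, NULL         },
    { "ssd",     surface_scan, smart_check, NULL         },
    { "ramdisk", surface_scan, NULL,        latency_scan },
};

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s hdd|ssd|ramdisk /dev/XXX\n", argv[0]);
        return 1;
    }
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++) {
        const struct storage_backend *b = &backends[i];
        if (strcmp(b->name, argv[1]) != 0)
            continue;
        int rc = b->surface_scan(argv[2]);
        if (b->smart_check)  rc |= b->smart_check(argv[2]);   /* cross-check */
        if (b->latency_scan) rc |= b->latency_scan(argv[2]);
        return rc;
    }
    fprintf(stderr, "unknown storage type: %s\n", argv[1]);
    return 1;
}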
« Last Edit: February 13, 2021, 11:12:35 am by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 
The following users thanked this post: MK14

Offline Twoflower

  • Frequent Contributor
  • **
  • Posts: 742
  • Country: de
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #13 on: February 13, 2021, 11:12:04 am »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...
That probably does not work. The SSD will detect that some parts of the SSD are written rarely and others very often. My assumption is that the SSD controller will detect such a thing and swap some of the static content into the often-written cells. The controller most likely doesn't care that much about "empty", but rather looks at the write-cycle counter of a section and acts on that information.

@DiTBho good luck wearing down an SSD. A German newspaper (others have done this too) did this in 2017 and the results were surprisingly good compared with the rated write cycles. After five months, nine out of twelve SSDs were dead. The 'worst' one was able to handle 188 terabytes written (manufacturer datasheet: 72TBW); the best in the test died at 4623TBW (manufacturer datasheet: 150TBW). Of course modern drives will behave differently. And the data will probably get corrupted within a short time, as the flash cell insulation is heavily damaged after such treatment. They also noticed that the dead drives did not allow any access any more (neither read nor write); some were not even recognized by the BIOS/UEFI. The German article is behind a paywall, but in case of interest: So lange halten SSDs.
 
The following users thanked this post: MK14

Offline tom66

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: gb
  • Electronics Hobbyist & FPGA/Embedded Systems EE
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #14 on: February 13, 2021, 12:13:12 pm »
At a prior employer I was responsible for testing eMMC devices used in set-top boxes for the 'trick play' function (it records the last 30 minutes of programming so you can pause and resume).

The bottom line is that the easiest way to kill flash memory is to get it hot. At 45C ambient, the lifespan of the memory was a third of what it was at room temperature: only 1,000 cycles vs 3,000 cycles at room temperature. And from the documentation I could find, this is not unusual: write endurance drops by about 30-40% for every 10C rise.

And when the devices failed, they would basically not acknowledge reads to any parts that were damaged, which would stall the OS.

So: write a lot of data to the memory while it is hot. Preferably, heat the drive up while doing so, provided it remains within thermal limits (and does not stop writing above a certain temperature - some drives do). Data patterns should cycle between 0 and 1: ideally, toggle every other bit or write random data to each page so that the average bit gets erased at least every two operations.

If you keep writing a lot of data to a drive in one file, the drive will move that file around, even if it has used sectors ... it's going to be more 'productive' in terms of wear rate if you write data to every sector, so use large files.
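
For example, something like this (an untested sketch; the device path is a placeholder, and it destroys everything on the drive, so only point it at a sacrificial one):
Code: [Select]
/* Sketch of whole-device pattern cycling: alternate 0x55 / 0xAA fills across
 * every sector so each bit toggles on every pass.  DESTROYS all data.
 * "/dev/sdX" is a placeholder for the sacrificial drive. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (1UL << 20)   /* write 1 MiB at a time */

int main(void)
{
    static unsigned char buf[CHUNK];

    for (unsigned pass = 0; ; pass++) {
        memset(buf, (pass & 1) ? 0xAA : 0x55, sizeof buf);  /* toggle every bit */

        int fd = open("/dev/sdX", O_WRONLY);                /* sacrificial SSD */
        if (fd < 0) { perror("open"); return 1; }

        unsigned long mib = 0;
        ssize_t n;
        while ((n = write(fd, buf, sizeof buf)) == (ssize_t)sizeof buf)
            mib++;
        if (n < 0)
            perror("write");   /* hitting the end of the device reports ENOSPC */

        fsync(fd);             /* make sure the tail end reaches the flash */
        close(fd);
        printf("pass %u done, %lu MiB written\n", pass, mib);
    }
}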
 
The following users thanked this post: MK14, DiTBho

Online MK14

  • Super Contributor
  • ***
  • Posts: 4883
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #15 on: February 13, 2021, 12:32:55 pm »
On eBay you can get second-hand (possibly sometimes new) SSDs of lower capacity for around (from memory) the £3-including-delivery mark. Various types are available, and some sellers have quite a few to sell.
You could also make them a bulk, low-ball offer.

E.g. £3.75 delivered, here:

https://www.ebay.co.uk/itm/Ramaxel-8GB-Mini-SSD-S800-S-SATA-DOM-Disk-on-Module-drive/114641308772

If you shop around and are patient, you can probably get them for less, and/or bigger-capacity ones, e.g. 16GB, etc.

Quote
Ramaxel 8GB Mini SSD S800-S SATA DOM Disk on Module drive
Condition: New
Quantity: 7 available, 24 sold
Price: £3.75
 

Online MK14

  • Super Contributor
  • ***
  • Posts: 4883
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #16 on: February 13, 2021, 01:05:19 pm »
Additionally, SSDs remap sectors very extensively, so writing a 'bad block' testing/detection program is probably extremely challenging, because the drive won't readily tell you where the sectors really are and/or let you access the 'removed from use' (bad-block) ones, etc.

Yup, this is a kind of "legacy" requirement, needed to also test electro-mechanical HDDs.

The testing program needs to support both technologies, and offer the user the possibility to set up the testing mode.

[..] which is the remapped and fault-corrected sectors, until the drive gets too many 'bad blocks', runs out of over-provisioning (and maybe other) areas

This is what I am trying to study. I have zero experience with SSDs and flash-based storage devices.


RAM-disks (volatile, not permanent, used by industrial machines with high IOPS)
I also have to support and test RAM-disks, and in this case I have to test the number of IOPS that go through an I/O system request. If there is a delay on a RAM-disk made with ECC RAM, it means the DRAM controller encountered a problem: the CRC failed and caught a parity error, and the controller tried to repeat the operation. If this happens too many times ... well, you don't see it happening from user-space, and even the kernel does not know anything about it; only the controller inside the RAM-disk knows what's happening, and it won't tell you anything unless there is a solid failure, which is too late.

The failure can be detected before it becomes serious: if you disable caching, issue a large number of IOPS, measure the time, and notice a selective delay in a certain area, it means there is probably something about to go wrong with the decoupling capacitors or something else on the physical PCB in the specific area where the DRAM chips are mounted! Or, worse still, part of the ECC RAM is becoming faulty and unreliable and needs to be replaced.

My RAM-disk has several RAM sticks installed, something like 32 banks of 4 Gbyte each. I did an experiment with a deliberately damaged ECC RAM stick in one of the 32 banks (I removed a capacitor from an old RAM stick), and my testing program caught it: bank#4 looks suspicious, slow IOPS, please check it!

Bingo! :D

SMART does somehow support this; unfortunately, my RAM-disk devices do not have these diagnostic features implemented. If I issue a check, it always returns "not supported" or, worse still, "all OK".

That's why I have added SMART as an additional layer to run cross-tests: this way I can write only one C program, organized into C sub-modules, and use it to test three kinds of storage technology: HDDs, SSDs, and RAM-disks!

If your testing program has to handle so many (three) VERY significantly different storage mediums, you could be spreading your time and other resources too thinly to do a decent job.

My understanding is that it is usually recommended NOT to use RAM disks as the real (actual/written) main/sole data store. But if the data is properly stored elsewhere, so the RAM disk is just there to speed things up or something, that is fine. It is a complicated subject area, so there are many other exceptions.
This is because a RAM disk is too susceptible to mechanisms which might corrupt, or even fully wipe, all the data, suddenly and without much/any warning.

E.g. general DRAM memory bit flips, not all of which are handled by ECC systems: e.g. multiple-bit flips in the same location in memory (64 bits), or even the address being used by the DRAM itself being corrupted (bit-flipped), so that the wrong memory address is read from or written to. Some systems CRC-check the address information as well.

Power failures of various kinds, not all of which will be prevented/stopped by UPS and other protective systems.

Various hardware failures, e.g. a single power supply, or possibly a shorted device if dual power supplies are fitted. I.e. if the data had been on an HDD, the hardware failure shouldn't have damaged the data, but the volatile RAM would lose all of it.

Various system crashes MIGHT damage the RAM disk's data.

The above are just some examples of how it can go wrong, and why it is not recommended.

But RAM disks are opening up a can of worms as regards possible differences of opinion. So I will end by saying that some system requirements, especially where very high performance is needed, may somewhat force the use of very fast RAM disks in order to economically reach the desired performance.
Ideally such a solution will/should, as quickly as practicable, copy the data onto non-volatile and more reliable mediums.
SSDs and big SSD arrays are already so fast that they might be able to handle things quickly enough.

But developing software/techniques to detect when ECC RAM has gone (or is going) bad, and identifying which slot/bank has failed or is failing, sounds like a good and interesting thing to research.

If you or the people you are helping want data reliability, what about things like ZFS file systems, decent backup solutions and RAID (and similar) disk arrays?
« Last Edit: February 13, 2021, 01:08:48 pm by MK14 »
 

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4227
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #17 on: February 13, 2021, 02:52:43 pm »
E.g. general DRAM memory bit flips, not all of which are handled by ECC systems: e.g. multiple-bit flips in the same location in memory (64 bits), or even the address being used by the DRAM itself being corrupted (bit-flipped), so that the wrong memory address is read from or written to. Some systems CRC-check the address information as well.

The DRAM controller should use 36 bits for 32 bits of data, which means 4 bits for the check: isn't that enough?
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online MK14

  • Super Contributor
  • ***
  • Posts: 4883
  • Country: gb
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #18 on: February 13, 2021, 03:12:57 pm »
The DRAM controller should use 36 bits for 32 bits of data, which means 4 bits for the check: isn't that enough?

That is enough to detect and correct only a 1-bit error (flip). It can detect more, especially 2-bit errors at the same address, but it hasn't got enough information to correct them, so it is likely to fail/crash/reboot or whatever happens in that circumstance.
 
The following users thanked this post: DiTBho

Offline Jwalling

  • Supporter
  • ****
  • Posts: 1517
  • Country: us
  • This is work?
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #19 on: February 13, 2021, 07:52:13 pm »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...
That probably does not work. The SSD will detect that some parts of the SSD are written rarely and others very often. My assumption is that the SSD controller will detect such a thing and swap some of the static content into the often-written cells. The controller most likely doesn't care that much about "empty", but rather looks at the write-cycle counter of a section and acts on that information.


I believe that you're wrong, but I'm guessing as well ;) I don't think the controller will move data from allocated sectors.
Jay

System error. Strike any user to continue.
 

Offline Twoflower

  • Frequent Contributor
  • **
  • Posts: 742
  • Country: de
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #20 on: February 13, 2021, 08:06:14 pm »
I believe that you're wrong, but I'm guessing as well ;) I don't think the controller will move data from allocated sectors.
According to Static wear leveling - Wikipedia or Wear Leveling - Transcend, the re-allocation is done in the static and global wear-levelling modes. And I think that is very common to all modern SSDs. They have to do a lot to provide usable reliability on the tiny multi-level cells, and doing these types of wear levelling comes more or less for free. The over-provisioning and error-correction methods are expensive, as those need additional memory area.
 

Offline Jwalling

  • Supporter
  • ****
  • Posts: 1517
  • Country: us
  • This is work?
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #21 on: February 13, 2021, 09:23:41 pm »
I believe that you're wrong, but I'm guessing as well ;) I don't think the controller will move data from allocated sectors.
According to Static wear leveling - Wikipedia or Wear Leveling - Transcend, the re-allocation is done in the static and global wear-levelling modes. And I think that is very common to all modern SSDs. They have to do a lot to provide usable reliability on the tiny multi-level cells, and doing these types of wear levelling comes more or less for free. The over-provisioning and error-correction methods are expensive, as those need additional memory area.

Fair enough. But in the scenario I first described, will the controller have the time to do this? Just theorizing (guessing) again, but some of the wear leveling may normally occur when the drive is idle.
Jay

System error. Strike any user to continue.
 

Offline Twoflower

  • Frequent Contributor
  • **
  • Posts: 742
  • Country: de
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #22 on: February 13, 2021, 09:58:03 pm »
That's a good question. Whether a single write request (modern SSDs can handle many requests in parallel) can saturate the write speed of the drive depends on many things, especially with the different caching mechanisms in place (RAM and the SLC area, over-provisioning). Also, the drive controller can delay the write request to do the housekeeping; for example, they do that if the SLC cache area is full and the data has to be written directly to the MLC region of the memory array.

A guess I can't find any documentation for: an SSD controller will probably slow down the write requests from the computer in order to manage such an internal copy process and not risk data loss by allowing the kind of attack you describe. A drive-internal copy task will be so fast that you won't notice it unless you run specific benchmarks. Especially since the over-provisioning always has some free area to swap data in and out of.

Probably the fastest way to damage the SSD would be to remove one flash chip, burn it to death with a microcontroller by writing to the same region over and over again, and solder it back. A few hundred times will probably be enough for modern MLC flash. But the SSD controller might notice that at the first write attempt to that region, since they do a write-verify procedure. So most likely the controller will mark the bad cells and write the data somewhere else before you see bad content on read-back.
 
The following users thanked this post: DiTBho

Offline james_s

  • Super Contributor
  • ***
  • Posts: 21611
  • Country: us
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #23 on: February 13, 2021, 10:27:48 pm »
I've encountered several failed SSDs, and so far the failure mode I've seen is that the drives turned read-only, and for some reason this stopped Windows from booting. I was able to clone them to new drives and everything was recovered.
 
The following users thanked this post: MK14

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 17106
  • Country: us
  • DavidH
Re: how can I voluntarily damage an SSD to test a testing program?
« Reply #24 on: February 14, 2021, 03:20:33 am »
Fill your SSD with data, but leave 50KB or so free. Then make a .cmd file that will keep writing a (just shy of) 50KB file to it.
Use two files, one with all 00s and one with all FFs. Alternate between the two.
Hopefully, it will wear out the flash in that area. Since the disk is almost full, that will subvert the wear-levelling algorithm for the drive...

At best that will rotate through the over-provisioned space; however, my understanding is that good SSD wear leveling algorithms will also rotate written data through used areas by swapping data around, so there is no pattern of writing which will not wear out all areas evenly.
 

