Topic: Why does NAND flash seem to be the only non-mechanical part that wears out with use?

Offline Logan (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 345
  • Country: us
NAND degrades with write operations, while nothing like that seems to happen with CPUs, RAM, etc. (they don't have a shorter life under high load). Even LCDs, which work by rotating the material in each pixel, don't die faster if you frequently change the screen content.
Why is only NAND wearing out like a mechanical thing (the more you use it, the faster it dies)?
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 4956
  • Country: si
This is mainly because the flash cells are sort of abused during the erase.

The cell itself is an isolated piece of metal sitting on an insulator. So in order to shove electrons into that cell, the voltage is raised so high that the insulator is brought to the edge of breakdown. This causes the electrons to force their way through the insulator into the cell. Since this is fairly violent to the insulator, it can cause it to slowly break down and eventually arc over, leaving the cell unable to hold charge because it's no longer isolated.

As they push flash to higher and higher densities, this problem typically gets worse. So even though we once had flash with 1M write cycles, lots of flash is only 100k cycles, while the really high density flash used in thumb drives and SSDs is only 10k cycles or less.
 

Offline Twoflower

  • Frequent Contributor
  • **
  • Posts: 737
  • Country: de
Just to mention, other semiconductor components also suffer wear. The most commonly known mechanism might be electromigration. And as transistors get smaller, other effects that used to be ignored are now coming into view (e.g. hot-carrier injection, similar to the flash wear mechanism, affects FETs as well).
 

Online ejeffrey

  • Super Contributor
  • ***
  • Posts: 3719
  • Country: us
Consider the on-off ratio needed for a flash cell compared to DRAM, which needs to be refreshed many times per second.  The flash cell needs to hold its charge for literally years, yet be higher density than DRAM.  It doesn't take much damage to the insulator to allow leakage that will discharge the cell in "only" a week.
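To put rough numbers on that comparison, here is a minimal sketch. The 64 ms DRAM refresh interval and the 10-year flash retention target are typical figures assumed for illustration; neither comes from this thread.

```python
# Rough comparison of how long a cell must hold its charge.
# Both constants are assumptions (typical values), not figures from the thread.
DRAM_REFRESH_S = 0.064                         # assumed 64 ms DRAM refresh interval
FLASH_RETENTION_S = 10 * 365.25 * 24 * 3600    # assumed 10-year retention target
ONE_WEEK_S = 7 * 24 * 3600                     # the "only a week" case from the post


def hold_time_ratio(hold_s: float = FLASH_RETENTION_S,
                    dram_s: float = DRAM_REFRESH_S) -> float:
    """How many times longer than a DRAM refresh interval the charge must last."""
    return hold_s / dram_s


print(f"10-year retention vs DRAM refresh: {hold_time_ratio():.1e}x")
print(f"one-week retention vs DRAM refresh: {hold_time_ratio(ONE_WEEK_S):.1e}x")
```

Under these assumptions, a flash cell has to hold its charge billions of times longer than a DRAM cell does, and even a damaged cell that only lasts a week still out-holds DRAM by a factor of millions.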
 

Offline Mr. Scram

  • Super Contributor
  • ***
  • Posts: 9810
  • Country: 00
  • Display aficionado
The fact that NAND can hold a charge for extended periods of time boggles my mind. It working so well is a triumph of engineering.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
The reason why flash cells wear with usage cannot be explained with classical physics and requires quantum physics, so I am not able to provide a reasonable scientific answer: when I was a student I chose to follow different university courses, mostly "computer science" and "digital electronics" rather than "solid state electronics".

Anyway, the reason is somehow classified under "due to the tunnel effect", so don't ask me more, but you can try searching online for "flash" "tunnel effect" to better understand the inner reasons.

Mind you, understanding the inner mechanism of the "tunnel effect" and why it degrades the "floating gate" of a flash cell requires a solid knowledge of quantum physics.

In short: "solid state electronics" -> "quantum physics" -> "tunnel effect" -> "floating gate flash-cell".
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
For a personal project I recently ran two FeRAM chips in a closed read-write-read loop to verify their maximum read/write endurance.

I can say no problems up to 10^12 cycles! My cheap flash chips all died around 10^9 cycles. So, it's a stronger and better permanent storage technology!

Unfortunately it's more expensive than flash. But I dream of an SSD made with FeRAM ;D
« Last Edit: March 19, 2021, 06:48:09 pm by DiTBho »

Offline AndyC_772

  • Super Contributor
  • ***
  • Posts: 4228
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
Also look up "oxide charging" - basically, extra electrons getting trapped within the insulating SiO2 layer of each memory bit each time a write is carried out. Over time these build up, they cannot escape (precisely because the SiO2 is a good insulator), but their electric field still has an effect on the gate. Once there's enough of them to turn a transistor on, that transistor is on forever, and the state of the memory bit becomes stuck.
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1209
  • Country: pl
Why only NAND wearing out like a mechanical thing? (The more you use it, the faster it dies).
As a counterexample: LEDs do wear out too. So do MOSFETs. So do CD-RW and DVD-RW media, and there the wear is not mechanical but chemical. EEPROM memories wear even faster than NAND flash. (Edit: that last claim was wrong, per a correct correction from wraper.)

« Last Edit: March 23, 2021, 01:28:58 am by golden_labels »
People imagine AI as T1000. What we got so far is glorified T9.
 

Online Ian.M

  • Super Contributor
  • ***
  • Posts: 12860
Metal oxide varistors also have a finite life, degrading slightly with every surge until they continue conducting after the surge and die, often spectacularly.
 

Offline I wanted a rude username

  • Frequent Contributor
  • **
  • Posts: 627
  • Country: au
  • ... but this username is also acceptable.
NAND will degrade by write operation, while there seems no such thing happening with CPU, RAM, etc

A large-scale Google study found exactly the opposite: error rates in DRAM increase with both utilisation and age. Can't cheat the second law of thermodynamics.
 

Offline Alex Eisenhut

  • Super Contributor
  • ***
  • Posts: 3338
  • Country: ca
  • Place text here.
Vacuum tubes are electrical non-mechanical parts that wear...  >:D

 :-DD
Hoarder of 8-bit Commodore relics and 1960s Tektronix 500-series stuff. Unconventional interior decorator.
 

Offline KrudyZ

  • Frequent Contributor
  • **
  • Posts: 278
  • Country: us
Vacuum tubes are electrical non-mechanical parts that wear...  >:D

 :-DD

So do light bulbs; fuses wear out every time they do their job.
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
The fact that NAND can hold a charge for extended periods of time boggles my mind. It working so well is a triumph of engineering.

That's nothing. Intersil built a range of Floating Gate Array voltage references where the reference voltage is the charge stored on the gate(s) of CMOS transistors! The reference voltage drift is claimed to be $10\,\mathrm{ppm}/\sqrt{\mathrm{kHr}}$.
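For anyone unfamiliar with the ppm-per-root-kilohour unit, here is a small sketch of how such a figure is usually read. It assumes the common convention that long-term drift grows with the square root of elapsed time; the function name is made up for illustration.

```python
import math

DRIFT_PPM_PER_ROOT_KHR = 10.0   # the claimed figure quoted in the post


def expected_drift_ppm(hours: float, coeff: float = DRIFT_PPM_PER_ROOT_KHR) -> float:
    """Drift in ppm after `hours`, assuming it scales with sqrt(time in kHr)."""
    return coeff * math.sqrt(hours / 1000.0)


for hours in (1000, 4000, 10000):
    print(f"{hours:>6} h: about {expected_drift_ppm(hours):.1f} ppm")
```

Under that assumption, quadrupling the time only doubles the expected drift.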

Datasheet: Here
Anybody got a syringe I can use to squeeze the magic smoke back into this?
 

Offline exe

  • Supporter
  • ****
  • Posts: 2562
  • Country: nl
  • self-educated hobbyist
Intersil built a range of Floating Gate Array voltage references where the reference voltage is the charge stored on the gate(s) of CMOS transistors! The reference voltage drift is claimed to be $10\,\mathrm{ppm}/\sqrt{\mathrm{kHr}}$.

They disappeared from sale quite quickly. I suspect the technology wasn't reliable, or the cost of manufacturing was too high. I wanted to buy a few X60003 ICs from eBay a while ago, but those were expensive.
 

Online mawyatt

  • Super Contributor
  • ***
  • Posts: 3269
  • Country: us
The fact that NAND can hold a charge for extended periods of time boggles my mind. It working so well is a triumph of engineering.

That's nothing. Intersil built a range of Floating Gate Array voltage references where the reference voltage is the charge stored on the gate(s) of CMOS transistors! The reference voltage drift is claimed to be $10\,\mathrm{ppm}/\sqrt{\mathrm{kHr}}$.

Datasheet: Here

I was at the presentation they made at the IEEE ISSCC a few years earlier. Absolutely brilliant concept, and a fascinating story of how they took the to-be-auctioned-off old Flash memory line and repurposed it for the Flash Voltage References :)

Edit: Our biggest concern after discussing the SS physics with the inventors at the ISSCC Social was the long term effects of heavy ions, wonder if this had something to do with the apparent demise?

Best,
« Last Edit: March 20, 2021, 05:28:37 pm by mawyatt »
Curiosity killed the cat, also depleted my wallet!
~Wyatt Labs by Mike~
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
Intersil built a range of Floating Gate Array voltage references where the reference voltage is the charge stored on the gate(s) of CMOS transistors! The reference voltage drift is claimed to be $10\,\mathrm{ppm}/\sqrt{\mathrm{kHr}}$.

They disappeared from sales quite quickly. I suspect the technology wasn't reliable, or cost of manufacturing was expensive. I wanted to buy a few X60003 ICs from ebay a while ago, but those were expensive.

I suspect that customer reluctance to specify them for new designs was the thing that killed them. They were, in the minds of most folk, literally an incredible idea. Even though they actually worked and seemed to meet their specifications and hold them over time, I don't think anyone, myself included, was prepared to believe in the idea. Adopting them seemed like a leap of faith.

Also, I seem to recall that they played fast and loose with the noise specifications, or to be charitable, calculated them on a flawed premise. They were not the first people to make exactly this mistake of calculating noise from theory and omitting some 1/f effects. Analog Devices made the same mistake, now corrected, on one of their op amp ranges, and it stayed in their data sheets for some years (which ones escapes me at the moment).

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2072
  • Country: br
    • CADT Homepage
In our pulse oximetry OEM modules we use TI MSP430F2616 MCUs that implement flash read modes with modified reference levels in hardware. This way it supports detection of flash bits that are probably going to change. In our system firmware we implemented a method that tests for and logs such abnormal behavior and refreshes those flash segments, somewhat similar to dynamic RAM. The procedure is a bit tricky, since system parts modifying flash cannot execute from flash.
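The scrub procedure described above could look roughly like this toy model. It is not the MSP430 firmware or TI's register interface: the `FlashSegment` class, the analog cell levels, and the 0.2 margin are all hypothetical stand-ins for the hardware marginal-read modes.

```python
class FlashSegment:
    """Toy model: each cell holds an analog level from 0.0 (empty) to 1.0 (fully charged)."""

    def __init__(self, levels):
        self.levels = list(levels)

    def read(self, reference=0.5):
        """Read every cell against a reference level; 0.5 is the normal reference."""
        return [1 if level > reference else 0 for level in self.levels]

    def rewrite(self, bits):
        """Erase and reprogram the segment, restoring full-strength levels."""
        self.levels = [1.0 if bit else 0.0 for bit in bits]


def is_marginal(segment, margin=0.2):
    """True if a shifted reference flips any bit, i.e. some cell is close to changing."""
    normal = segment.read()
    return (segment.read(0.5 - margin) != normal
            or segment.read(0.5 + margin) != normal)


def scrub(segments, margin=0.2):
    """Refresh every marginal segment and return the indices that were refreshed."""
    refreshed = []
    for index, segment in enumerate(segments):
        if is_marginal(segment, margin):
            segment.rewrite(segment.read())   # data still reads correctly at the normal reference
            refreshed.append(index)           # stands in for the event log
    return refreshed


healthy = FlashSegment([1.0, 0.0, 1.0])
weak = FlashSegment([1.0, 0.0, 0.62])         # last cell has lost some charge
print("refreshed segments:", scrub([healthy, weak]))
```

The point of reading with shifted references is that a weak cell is caught while it still returns the right value at the normal reference, so it can be rewritten before any data is lost.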

Since this is medical equipment, each and every failure comes back to my desk, but those flash events are extremely rare. I guess crystal failures because somebody dropped the device are more likely. Just want to mention that flash may also suffer from ionizing radiation, e.g. caused by cosmic ray events.

Regards, Dieter
 

Offline Rick Law

  • Super Contributor
  • ***
  • Posts: 3442
  • Country: us
...
Anyway, the reason is somehow classified under "due to the tunnel effect", so don't ask me more, but you can try to search online "flash" "tunnel effect" to better understand the inner reasons.

Be minded, to understand the inner mechanism of the "tunnel effect" and why it degrades the "floating gate" of a flash-cell it's required a solid knowledge of quantum physics.

In short:: "solid state electronics" -> "quantum physics" -> "tunnel effect" -> "floating gate flash-cell".

To understand the tunneling effect, one doesn't need to know a lot of quantum physics.  One merely has to understand that at very small scales, particles can behave like waves.  To discuss what is "waving" would require more physics.

In the Copenhagen interpretation (whose notable supporters include Niels Bohr and Werner Heisenberg), the wave describes the probability of where you would find the particle if you looked (strictly, the probability is the square of the wave's amplitude).  This is what I will use for the brief explanation below.

That said, let's go into what quantum tunneling is:

Say you have a marble in a bowl and you leave it there and no one else is around to touch it.  So, you expect to find it in the bowl when you come back a few minutes later.  Somehow, when you come back, the marble is sitting next to the bowl, as if it had rolled up the side of the bowl and dropped down the other side -- or as if it had simply tunneled through to the other side.  For these kinds of events, "tunnel" is the name that stuck.

For an electron or other particles that small, the wave property dominates.  The wave is everywhere, and its squared amplitude gives the probability of where the electron would be if you looked.  Since the electron is inside and at the bottom of an energy bowl (i.e., a pit of low energy), it has to "climb" the higher-energy wall of the bowl's sides in order to get out; it is contained by an energy wall all around it.  But as a wave, the probability of finding it is: anywhere you look, it could be there.  Outside the bowl, the probability is small but non-zero.  It is possible that you could find it outside the bowl, and indeed it happens.  It is like throwing a thousand coins and having them all land "heads": the probability is very small, but it could happen.  If you could throw the thousand coins an unlimited number of times, you can imagine that it WILL happen eventually.  The electron acts as if it had dug a tunnel through the side of the energy bowl.

For an electron to suddenly end up outside the bowl, the probability is very small, but given enough time it will happen.  The higher the energy barrier, the lower the probability of tunneling through it: lower, but still non-zero.  Mathematically speaking, if the probability of some particular thing happening is very, very small but NON-ZERO, then given enough time, it WILL happen.
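To see how sharply that probability falls off, here is a minimal sketch using the textbook approximation for a wide rectangular barrier, T ≈ exp(-2κd) with κ = sqrt(2m(V - E))/ħ. The 3 eV barrier height, the thicknesses, and the use of the free electron mass are illustrative assumptions; a real flash cell tunnels through a field-tilted barrier, which this does not model.

```python
import math

HBAR = 1.054571817e-34     # reduced Planck constant, J*s
M_E = 9.1093837015e-31     # free electron mass, kg (assumption: no effective-mass correction)
EV = 1.602176634e-19       # one electronvolt in joules


def tunnel_probability(barrier_ev: float, thickness_nm: float) -> float:
    """Approximate transmission exp(-2*kappa*d) through a rectangular barrier.

    barrier_ev is the barrier height above the electron's energy (V - E).
    """
    kappa = math.sqrt(2.0 * M_E * barrier_ev * EV) / HBAR   # decay constant in 1/m
    return math.exp(-2.0 * kappa * thickness_nm * 1e-9)


for d_nm in (1.0, 2.0, 4.0, 8.0):   # hypothetical barrier thicknesses
    print(f"{d_nm:.0f} nm barrier: probability per attempt ~ {tunnel_probability(3.0, d_nm):.3e}")
```

Because the thickness sits in the exponent, doubling it squares the (already tiny) probability. That is the sense in which the chance is "small but non-zero", and why a thin insulator makes such a difference.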

When I was in college (majoring in Physics) and just being an annoying individual, I liked to rattle my fellow student friends who majored in EE by telling them that but for the tunneling effect, you wouldn't even have lights in the room.  Think about it: the copper contacts in the light switch would have had a coating of non-conductive copper oxide from the moment the copper first made contact with air.  How then does the switch make electrical contact?  Well, tunneling comes to the rescue to get things started, then the avalanche effect takes over.

References:
I will reuse a citation from an earlier reply, where I explained tunneling with a different example:

[1] Electric Contacts, Theory and Application by Ragnar Holm, published by Springer-Verlag Berlin Heidelberg GmbH

4th edition, page 136:
"...It is generally assumed that the first stage of any high resistivity film breakdown is the injection of electrons into the film [ of oxides ] by kind of field emission or Zener effect.  The strong field makes the boundary barrier steeper and consequently thinner so that electrons can tunnel through the hill... ...the material within the path is strongly heated by the current and thereby the cohesion of the film material is diminished... ...a channel through the film is produced"

EDIT: Corrected a couple of spelling errors and typo.
« Last Edit: March 22, 2021, 07:12:14 am by Rick Law »
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2072
  • Country: br
    • CADT Homepage
Would be nice to have some numbers, e.g. how many electrons are in a state-of-the-art NAND cell?
As far as I remember, the likelihood, or let's say the speed, of charge loss depends exponentially on the ratio of thermal excitation energy to well depth. If I remember correctly, an EPROM or flash memory will last hundreds of years at 20 °C.
It also depends on memory size: the same small chance gets bigger when there are 10**9 or more cells, as in modern SSDs. And in SSDs they store up to four bits in one cell (QLC). I guess those products are similar to a CD-ROM or DVD in that they can only work with smart redundancy.
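Both points can be sketched in a few lines. The first function assumes a simple Arrhenius-type law (loss rate proportional to exp(-Ea/kT)); the second assumes cells fail independently. The activation energy and per-cell probability used in the example are hypothetical placeholders, not measured values.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant in eV/K


def acceleration_factor(ea_ev: float, t_cool_c: float, t_hot_c: float) -> float:
    """How much faster a thermally activated loss runs at t_hot_c than at t_cool_c."""
    t_cool = t_cool_c + 273.15
    t_hot = t_hot_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_cool - 1.0 / t_hot))


def p_any_cell_fails(p_cell: float, n_cells: int) -> float:
    """Probability that at least one of n independent cells fails: 1 - (1 - p)^n."""
    return -math.expm1(n_cells * math.log1p(-p_cell))


# Hypothetical inputs, for illustration only.
print(f"20 C -> 85 C speed-up at Ea = 1.0 eV: {acceleration_factor(1.0, 20.0, 85.0):.0f}x")
print(f"P(any of 1e9 cells fails) at p = 1e-12 per cell: {p_any_cell_fails(1e-12, 10**9):.2e}")
```

The second function is the memory-size effect: a one-in-a-trillion chance per cell becomes roughly a one-in-a-thousand chance per device once there are 10**9 cells, which is why error correction is unavoidable.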

Regards, Dieter
 

Offline exe

  • Supporter
  • ****
  • Posts: 2562
  • Country: nl
  • self-educated hobbyist
Would be nice to have some numbers, e.g. how many electrons are in a state-of-the-art NAND cell?

I read in one article about QLC cells that there are literally tens of electrons, and margins are very low. It requires a very precise voltage drive, all that in mass-produced electronics in a noisy environment with large temperature variation.

I can't find that source. This article mentions "dozens of electrons per cell", but doesn't provide any specific numbers: https://www.crss.ucsc.edu/event/365.html

PS I didn't know that there are PLC (penta-level cells) which hold 5 bits. That sounds scary. Endurance is predictably low: https://blocksandfiles.com/2019/08/07/penta-level-cell-flash/
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16865
  • Country: lv
EEPROM memories wear even faster than NAND flash.
Opposite of reality. EEPROM lasts more rewrites than even SLC NAND, not to mention MLC, TLC and QLC.
 

Offline Alex Eisenhut

  • Super Contributor
  • ***
  • Posts: 3338
  • Country: ca
  • Place text here.
OOh, and speaking of tunnel, tunnel diodes wear out, their peak current drifts all over the place, carbon composition resistors drift up in value, except values under about 10 ohms which seem to drift down.
Has anyone mentioned electrolytics?  8)
Permanent magnets can wear out if heated over their Curie point.
I know I already said vacuum tubes, but CRTs wear out.
And of course, the famous Commodore 64 chip, the PLA, routinely fails all by itself even unpowered. And Micron Technology 64Kx1 DRAMs from the 1980s also fail by themselves. I heard MT 256Kx1 chips also fail a lot.

Offline Circlotron

  • Super Contributor
  • ***
  • Posts: 3180
  • Country: au
On the flip side, I expect air cored inductors, wire wound resistors and epoxy dipped mica capacitors would last quite a while in a benign environment.
« Last Edit: March 23, 2021, 01:38:23 am by Circlotron »
 

Offline exe

  • Supporter
  • ****
  • Posts: 2562
  • Country: nl
  • self-educated hobbyist
Ah, wait a minute, it seems FGA references from Renesas are still a thing. I think I see the ISL60002 in stock at Mouser. Wow, unbelievable. I might buy a few just for fun.

Here is the list of devices: https://www.renesas.com/us/en/products/power-power-management/voltage-references?field-technology=Floating%20Gate%20Array&method-field-technology=OR .

TBH, it looks like a niche device because the tempco is not that great (10-50ppm), and the noise density of 1.1 uV/sqrt(Hz) is the biggest I've seen yet. Can it be that noise is the downside of their power consumption of less than 1uA? Price-wise they are imho expensive, $2-3 in single quantities.
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
TBH, looks like a niche device because tempco is not that great (10-50ppm), also noisy density of 1.1 uV/sqrt(Hz) is the biggest I've seen yet. Can it be that noise is the down-side of their power consumption of less than 1uA? Price-wise they are imho expensive, $2-3 in single quantities.

That figure is only for the 0.1 Hz to 10 Hz band, the good old 1/f noise region, so it's not as terrible as it sounds. They give typical figures of 400 µV ptp in the 10 kHz to 1 MHz band ($402\,\mathrm{nV}/\sqrt{\mathrm{Hz}}$ ptp, roughly $67\,\mathrm{nV}/\sqrt{\mathrm{Hz}}$ rms). The part has the usual CMOS problem of a high 1/f noise corner. Still pretty horrible, but not quite as horrible as that $1.1\,\mu\mathrm{V}/\sqrt{\mathrm{Hz}}$ (rms) figure suggests.
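The conversion behind those bracketed numbers can be reproduced with a short sketch. It assumes the noise is flat (white) across the 10 kHz to 1 MHz band and uses the common rule of thumb that peak-to-peak is about 6 times rms; both are assumptions, and the function name is made up.

```python
import math


def ptp_to_density(v_ptp: float, f_low_hz: float, f_high_hz: float,
                   ptp_per_rms: float = 6.0):
    """Convert band-limited peak-to-peak noise to (ptp, rms) density in V/sqrt(Hz).

    Assumes white noise over the band and a fixed ptp-to-rms ratio (rule of thumb).
    """
    bandwidth_hz = f_high_hz - f_low_hz
    density_ptp = v_ptp / math.sqrt(bandwidth_hz)
    return density_ptp, density_ptp / ptp_per_rms


d_ptp, d_rms = ptp_to_density(400e-6, 10e3, 1e6)
print(f"{d_ptp * 1e9:.0f} nV/sqrt(Hz) ptp, {d_rms * 1e9:.0f} nV/sqrt(Hz) rms")
```

With the 400 µV figure from the post this gives about 402 and 67 nV/sqrt(Hz), matching the numbers quoted above.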

Offline GlennSprigg

  • Super Contributor
  • ***
  • Posts: 1259
  • Country: au
  • Medically retired Tech. Old School / re-learning !
I read an interesting webpage just the other day about this...
https://helpdeskgeek.com/reviews/everything-you-need-to-know-about-ssd-wear-tear/

And also a link on that page about "Wear Leveling" software these days...
https://searchstorage.techtarget.com/definition/wear-leveling
It's amazing how the software actually monitors the SSD sections, for writing and deleting data, so that the
next 'write' doesn't use the same/recently used areas, to prolong their 'cell' life !!  :phew:
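As a rough idea of what such wear leveling does, here is a minimal sketch. It is a toy, not any real SSD controller's algorithm: the `WearLeveler` class, its logical-to-physical map, and the "least-erased free block" policy are assumptions made for illustration.

```python
class WearLeveler:
    """Toy dynamic wear leveling: each write goes to the least-erased available block."""

    def __init__(self, n_blocks: int):
        self.erase_counts = [0] * n_blocks   # wear per physical block
        self.mapping = {}                    # logical block -> physical block

    def write(self, logical: int) -> int:
        """Rewrite a logical block and return the physical block that was used."""
        # Blocks holding other logical blocks' data are off limits; the block that
        # currently holds this logical block's old data may be reused.
        in_use = set(self.mapping.values()) - {self.mapping.get(logical)}
        candidates = [b for b in range(len(self.erase_counts)) if b not in in_use]
        target = min(candidates, key=lambda b: self.erase_counts[b])
        self.erase_counts[target] += 1       # erase-before-write wears the block
        self.mapping[logical] = target
        return target


leveler = WearLeveler(n_blocks=4)
for _ in range(400):
    leveler.write(0)                         # hammer a single logical block
print(leveler.erase_counts)
```

Even though the host keeps rewriting the same logical block, the wear ends up spread evenly over all four physical blocks instead of landing on one.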
Diagonal of 1x1 square = Root-2. Ok.
Diagonal of 1x1x1 cube = Root-3 !!!  Beautiful !!
 

Offline Alex Eisenhut

  • Super Contributor
  • ***
  • Posts: 3338
  • Country: ca
  • Place text here.
Here's a worrisome one
https://www.npr.org/templates/story/story.php?storyId=112003322

It's a mass standard, so it's not electronic, yet it's not moving either, so not mechanical?

It appears to change mass over time, so it's no longer accurate, which I guess means it's "wearing"?  ;D
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
Here's a worrisome one
https://www.npr.org/templates/story/story.php?storyId=112003322

It's a mass standard, so it's not electronic, yet it's not moving either, so not mechanical?

It appears to change mass over time, so it's no longer accurate, which I guess means it's "wearing"?  ;D

Somebody's not keeping up with the times:

Since 20 May 2019 the kilogram is defined in terms of the Planck constant, a fundamental constant of quantum physics, which by its nature is invariant and universally accessible. It has replaced the last artefact definition of the SI, the mass of a unique object known as the "international prototype of the kilogram" which had served to define the kilogram since 1889.

Offline Alex Eisenhut

  • Super Contributor
  • ***
  • Posts: 3338
  • Country: ca
  • Place text here.
Well that's a relief. Still means matter just sitting there is constantly doing something whether we want it to or not.
Makes me grateful for my DNA and mitochondria, staving off entropy temporarily.
 

Online Doctorandus_P

  • Super Contributor
  • ***
  • Posts: 3361
  • Country: nl
Electrolytic capacitors wear out too.
 

Offline NiHaoMike

  • Super Contributor
  • ***
  • Posts: 9018
  • Country: us
  • "Don't turn it on - Take it apart!"
    • Facebook Page
Also look up "oxide charging" - basically, extra electrons getting trapped within the insulating SiO2 layer of each memory bit each time a write is carried out. Over time these build up, they cannot escape (precisely because the SiO2 is a good insulator), but their electric field still has an effect on the gate. Once there's enough of them to turn a transistor on, that transistor is on forever, and the state of the memory bit becomes stuck.
So in theory, Flash could be "refreshed" by zapping it with intense UV light to dissipate the electrons like is done with the old UV EPROMs?
Cryptocurrency has taught me to love math and at the same time be baffled by it.

Cryptocurrency lesson 0: Altcoins and Bitcoin are not the same thing.
 

Offline DC1MC

  • Super Contributor
  • ***
  • Posts: 1882
  • Country: de
Also look up "oxide charging" - basically, extra electrons getting trapped within the insulating SiO2 layer of each memory bit each time a write is carried out. Over time these build up, they cannot escape (precisely because the SiO2 is a good insulator), but their electric field still has an effect on the gate. Once there's enough of them to turn a transistor on, that transistor is on forever, and the state of the memory bit becomes stuck.
So in theory, Flash could be "refreshed" by zapping it with intense UV light to dissipate the electrons like is done with the old UV EPROMs?

Heat can also be used. Some while ago an interesting patent and proof of concept appeared where, alongside the flash structures, they put some resistive silicon bars to heat them and chase away the trapped electrons. Sadly, with flash prices crashing and the current behavior being most helpful for planned obsolescence, nothing has been heard of this research since.

 Cheers,
 DC1MC
 

Offline SeanB

  • Super Contributor
  • ***
  • Posts: 16284
  • Country: za
So just desoldering and resoldering a flash chip can actually reform any failing cells. Wonder if there will be some enterprising people simply using a reflow oven to reflow entire boards of flash memory, just to revive them.
 

Offline AndyC_772

  • Super Contributor
  • ***
  • Posts: 4228
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
No, that's not something I've ever heard of being the case - though I wouldn't be at all surprised to find it's possible to completely wreck a flash device by overheating it.

Offline Syntax Error

  • Frequent Contributor
  • **
  • Posts: 584
  • Country: gb
NAND cells are rather like the eponymous game of ball-in-a-cup; there is a limit to how many times you can catch the ball in the cup before either the string or the cup breaks!

NAND chips often leave the production line with bad blocks. Sub 100% quality is how manufacturers are able to achieve competitive volume pricing. One or two bad blocks on a new chip is common. See your Linux system bootlog to view the bad block count. Filesystems such as UBIFS are designed to deal with bad blocks and wear levelling in NAND.

For example, this is from a new-ish router:
Quote
nand: device found, Manufacturer ID: 0x01, Chip ID: 0xf1
nand: AMD/Spansion S34ML01
nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
Bad block table found at page 65472, version 0x01
nand_read_bbt: bad block at 0x000007a20000

I should add for the worried, no bad blocks in a NAND is rare. As with a mechanical HDD, a few bad blocks is no real issue. As for NAND implemented SSD drives, expect to replace them over the same timescale as mechanical HDDs. Mantra > NAND does not last forever.
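The addresses in the boot log above can be turned into block numbers with a little arithmetic. This sketch only uses the sizes printed in that log (128 MiB device, 128 KiB erase size, 2048-byte pages); the helper names are made up.

```python
# Geometry taken from the boot log quoted above.
DEVICE_SIZE = 128 * 1024 * 1024   # 128 MiB
ERASE_SIZE = 128 * 1024           # 128 KiB per erase block
PAGE_SIZE = 2048                  # bytes per page


def block_of_address(byte_address: int) -> int:
    """Erase-block index that contains a given byte address."""
    return byte_address // ERASE_SIZE


def block_of_page(page_number: int) -> int:
    """Erase-block index that contains a given page number."""
    return page_number * PAGE_SIZE // ERASE_SIZE


total_blocks = DEVICE_SIZE // ERASE_SIZE
bad_block = block_of_address(0x000007A20000)   # address from nand_read_bbt
bbt_block = block_of_page(65472)               # page holding the bad block table

print(f"{total_blocks} blocks in total")
print(f"bad block is block {bad_block}")
print(f"bad block table lives in block {bbt_block}")
```

So this chip has one bad block (number 977) out of 1024, and the bad block table itself sits in the very last block, 1023.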
« Last Edit: May 03, 2021, 09:24:39 am by Syntax Error »
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16865
  • Country: lv
NAND cells are rather like the eponimous game of ball-in-a-cup; there is a limit to how many times that you can catch the ball in the cup - before either the string or the cup breaks!

NAND chips often leave the production line with bad blocks. Sub 100% quality is how manufacturers are able to achieve competitive volume pricing. One or two bad blocks on a new chip is common. See your Linux system bootlog to view the bad block count. Filesystems such as UBIFS are designed to deal with bad blocks and wear levelling in NAND.

For example, this is from a new-ish router:
Quote
nand: device found, Manufacturer ID: 0x01, Chip ID: 0xf1
nand: AMD/Spansion S34ML01
nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
Bad block table found at page 65472, version 0x01
nand_read_bbt: bad block at 0x000007a20000
I should add for the worried, no bad blocks in a NAND is rare. As with a mechanical HDD, a few bad blocks is no real issue. As for NAND implemented SSD drives, expect to replace them over the same timescale as mechanical HDDs. Mantra > NAND does not last forever.
What a load of half-truths. On brand new chips even hundreds of bad blocks are normal, and they are marked at the factory. Low quality NAND goes into memory cards; higher quality goes into SSDs and other applications with higher reliability requirements. In memory cards, eMMC in smartphones and SSDs in computers, bad blocks are dealt with by the controller chip in the storage device and are not seen by the OS. New bad blocks (besides production defects) appearing is not normal, and they especially must not be present on a brand new storage device. The only case where you can normally see those bad blocks is in unmanaged flash memory devices, like in that router. And there is only a single bad block because that chip has a very small capacity.
Quote
As with a mechanical HDD, a few bad blocks is no real issue.
A few new bad blocks appearing on an HDD is a strong sign that it will quite likely die soon, together with all of your data, and that some of your data has likely already been corrupted. Bad blocks found during production of an HDD are completely hidden from the OS.
« Last Edit: May 03, 2021, 05:03:31 pm by wraper »
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 4956
  • Country: si
Yeah, the number of bad blocks is so low because it's a small flash chip and it's SLC flash.

This is the best flash type, storing only 1 bit per cell. Even the nice expensive SSDs don't use this anymore; they typically use MLC flash (2 bits per cell). Most of the usual midrange SSDs use TLC (3 bits per cell) flash, while the new cheap SSDs are going for QLC (4 bits per cell).

Bad blocks are becoming so common that a 64GB SD card is often a 128GB SD card that had too many dead blocks to have 128GB of usable space. They just pick out the best flash chips for the 128GB models.

This is nothing new; it massively improves the yield in silicon production. For example, the Nvidia GPUs for GTX x70 models (970, 1070...) are the same GPUs as on the faster GTX x80 models (980, 1080...), except that the x70 chip had too many dead cores on it, so they just disable the dead cores and specify this chip as having fewer CUDA cores.
 

Offline tszaboo

  • Super Contributor
  • ***
  • Posts: 7390
  • Country: nl
  • Current job: ATEX product design
Intersil built a range of Floating Gate Array voltage references where the reference voltage is the charge stored on the gate(s) of CMOS transistors! The reference voltage drift is claimed to be $10\,\mathrm{ppm}/\sqrt{\mathrm{kHr}}$.

They disappeared from sales quite quickly. I suspect the technology wasn't reliable, or cost of manufacturing was expensive. I wanted to buy a few X60003 ICs from ebay a while ago, but those were expensive.

I suspect that customer reluctance to specify them for new designs was the thing that killed them. They were, in the minds of most folk, literally an incredible idea. Even though they actually worked and seemed to meet their specifications and hold them with time I don't think anyone, myself included, was prepared to believe in the in idea. Adopting them seemed like a leap of faith.

Also, I seem to recall that they played fast and loose with the noise specifications, or to be charitable calculated them on a flawed premise. They were not the first people to make exactly the same mistake - calculating noise from theory and omitting some 1/f effects - Analog Devices make the same mistake, now corrected, on one of their op amp ranges which stayed in their data sheets for some years (which ones escapes me at the moment).
Wait, what? They literally just shovel a bunch of electrons into a bucket floating gate, and sell it as a reference voltage? That is indeed incredible.
 

Offline TMM

  • Frequent Contributor
  • **
  • Posts: 471
  • Country: au
I read an interesting webpage just the other day about this...
https://helpdeskgeek.com/reviews/everything-you-need-to-know-about-ssd-wear-tear/

And also a link on that page about "Wear Leveling" software these days...
https://searchstorage.techtarget.com/definition/wear-leveling
It's amazing how the software actually monitors the SSD sections, for writing and deleting data, so that the
next 'write' doesn't use the same/recently used areas, to prolong their 'cell' life !!  :phew:
This is nothing new, engineers have had to do this with other types of non volatile memory for decades in applications where a lot of writes take place.

e.g. in a car's digital dashboard cluster, I found that they stored the odometer reading across 20 EEPROM registers, incrementing each of the 20 registers in sequence every kilometer the car traveled. So when the car has done 1,000,000 km, each register has only seen 50,000 writes, and the system probably still works at that point.
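A minimal sketch of that odometer scheme, to show the arithmetic. This is my guess at how such a thing could work, not the actual firmware from that cluster: the `Odometer` class and its methods are made up, and a Python list stands in for the EEPROM. The reading is taken as the sum of all registers, and the next register to write is derived from that sum, so no separate pointer has to be stored (and worn out) anywhere.

```python
NUM_REGISTERS = 20  # register count from the post above


class Odometer:
    """Round-robin odometer: the reading is the sum of all registers."""

    def __init__(self) -> None:
        self.registers = [0] * NUM_REGISTERS  # stand-in for EEPROM cells
        self.writes = [0] * NUM_REGISTERS     # write count per register

    def read_km(self) -> int:
        return sum(self.registers)

    def add_km(self) -> None:
        # The next slot follows from the current total, so only one
        # register is written per kilometer.
        slot = self.read_km() % NUM_REGISTERS
        self.registers[slot] += 1
        self.writes[slot] += 1


odo = Odometer()
for _ in range(1_000_000):
    odo.add_km()

print(odo.read_km(), max(odo.writes))  # prints "1000000 50000"
```

Each register ends up with 1/20 of the total writes, which is where the 50,000 figure comes from.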

I'd imagine all kinds of industrial equipment which keep track of the number of repetitive cycles in an EEPROM do the same thing

On a related note, some older Tesla vehicles have started to fail because Tesla didn't take enough precautions to limit the number of writes to the eMMC flash:
https://www.tomshardware.com/news/flash-memory-wear-killing-older-teslas-due-to-excessive-data-logging-report

So just desoldering and resoldering a flash chip can actually reform any failing cells. Wonder if there will be some enterprising people simply using a reflow oven to reflow entire boards of flash memory, just to revive them.
The harder part is probably resetting the system which keeps track of which cells are bad. Even if you revive the cells, they are still marked as being bad.
« Last Edit: May 04, 2021, 12:42:13 pm by TMM »
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16865
  • Country: lv
Heat can also be used. Some while ago an interesting patent and proof of concept appeared where, along the flash structures, they put some resistive silicon bars to heat them and chase away the trapped electrons. Sadly, due to flash prices crashing and the current behavior being most helpful for planned obsolescence, nothing has been heard of this research anymore.
I see it as yet another conspiracy nonsense. Some proof of concept which is barely related to ICs does not mean it is a viable option for real devices.
 

Offline DC1MC

  • Super Contributor
  • ***
  • Posts: 1882
  • Country: de
Heat can also be used. Some while ago an interesting patent and proof of concept appeared where, along the flash structures, they put some resistive silicon bars to heat them and chase away the trapped electrons. Sadly, due to flash prices crashing and the current behavior being most helpful for planned obsolescence, nothing has been heard of this research anymore.
I see it as yet another conspiracy nonsense. Some proof of concept which is barely related to ICs does not mean it is a viable option for real devices.

Well, what can I say, that's a friendly fellow, the guys at Macronix usually are making pizza, not flash or other kinds of memory, but I would say for those interested read the original article and judge for yourself:

https://phys.org/news/2012-12-taiwan-defeat-limits-memory.html

https://ieeexplore.ieee.org/document/6479008

Cheers,
DC1MC

P.S: Sadly just desoldering it will not heal a flash :(
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16865
  • Country: lv
Quote
For their upcoming IEEE presentation, they said they propose and demonstrate a novel self-healing flash, where a high temperature (>800°C), and short time annealing are generated by a built-in heater.
I guess I know the reason it did not get into production ICs.
 

