Topic: 2025 replacement for Samsung 840 EVO SSD (high reliability embedded use)

Offline The Soulman (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1100
  • Country: nl
  • The sky is the limit!
Years ago I would have used an Intel SLC industrial SSD.
Is there any modern high-reliability equivalent?
I've heard of QLC or TLC SSDs that can be switched to SLC (at much lower capacity, obviously) with different firmware.
I can't find any info on that anymore.

Any recommendations? 120GB is plenty, and ideally under €500. The OS is Windows Embedded.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8921
I've heard of qlc or tlc ssd's that could be switched to (much lower capacity obviously) slc by different firmware?
Can't find any info anymore.
https://theoverclockingpage.com/2024/05/13/tutorial-transforming-a-qlc-ssd-into-an-slc-ssd-dramatically-increasing-the-drives-endurance/?lang=en
 
The following users thanked this post: thm_w, The Soulman

Offline wraper

  • Supporter
  • ****
  • Posts: 18884
  • Country: lv
Unless you are actually going to make so many writes that you need the extra write endurance, or you use it for long-term unpowered storage, IMHO just get a decent (not DRAM-less) TLC drive and leave the firmware alone rather than run it with firmware of questionable origin and quality. Most SSD failures occur not because of actual NAND wear but because the FW craps out and locks the drive anyway.
« Last Edit: April 13, 2025, 02:45:51 am by wraper »
 
The following users thanked this post: The Soulman

Offline Infraviolet

  • Super Contributor
  • ***
  • Posts: 1260
  • Country: aq
Western Digital's "Red" branded SSDs of the SA500 type might be fairly good? They're designed for use within servers, so they should have more of a longevity focus than typical consumer SSDs. And although they're sold as "Red", they are not designed to run in the classical NAS fashion, where a drive quickly gives up on a read error rather than retrying; they still behave like a normal drive, where data quality is prioritised over avoiding delay. Unlike a lot of SSD models (every manufacturer has had some bad ones), there aren't any reports of known firmware bugs for this model.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8921
Unless you are actually going to make so many writes that you need the extra write endurance, or you use it for long-term unpowered storage, IMHO just get a decent (not DRAM-less) TLC drive and leave the firmware alone rather than run it with firmware of questionable origin and quality. Most SSD failures occur not because of actual NAND wear but because the FW craps out and locks the drive anyway.
The SLC mod for SM (Silicon Motion) drives uses the original firmware and production tools; it just sets the drive to SLC mode, which was always there in the form of the SLC cache.
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 41126
  • Country: au
    • EEVblog
Wouldn't the various computer Youtubers be all over this stuff? e.g. LTT, Gamers Nexus etc
 

Online Berni

  • Super Contributor
  • ***
  • Posts: 5220
  • Country: si
The tech Youtubers don't care about this so much since it doesn't meaningfully affect the read/write speed to put a drive in SLC mode. Read speed is unaffected (and is usually limited by the interface speed anyway for fast drives) while write speed is only affected if you write a very large amount of data to overflow the SLC cache.

That being said, SLC is not a magic bullet. Modern flash chips have much smaller cells so that more of them can be packed onto a chip. This means each cell has a smaller bucket of charge storing the data, so it more easily leaks or gets flipped. So an old low-capacity TLC-capable chip is likely more reliable than a modern high-capacity chip running in SLC mode.
 

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 18745
  • Country: us
  • DavidH
Swissbit makes fully specified SLC and high endurance drives.  I see an MLC 128G drive for $264.85, so they are not cheap.

https://www.swissbit.com/en/products/nand-flash-products/2-5-sata-ssd/

Wouldn't the various computer Youtubers be all over this stuff? e.g. LTT, Gamers Nexus etc

I have seen very little coverage about SSD reliability and endurance for years.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4381
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Wouldn't the various computer Youtubers be all over this stuff? e.g. LTT, Gamers Nexus etc
I don't think they have the in-depth knowledge to talk about this. They are mostly repeating press releases.
Can't blame them though. They are journalists after all.

You can buy enterprise disks, e.g. from Micron. For example, the Micron 5400 in its MAX variant supposedly has "Up to 23,827TB" TBW.
Or if you want to go really fancy, stuff like Swissbit. You will feel that though, at >€1 per GB.

I don't think an embedded Windows install is going to wear one out, though. Maybe it's not worth the price.
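
As a rough sanity check on whether wear matters at all, a rated TBW figure can be turned into years of service. This is only a sketch: the 100 TBW rating, the 20 GB/day write rate and the 5-year warranty are made-up illustrative numbers, not figures for any drive in this thread, and the function names are hypothetical.

```python
def years_until_tbw_exhausted(tbw_terabytes: float, host_writes_gb_per_day: float) -> float:
    """Years of constant host writes before the rated TBW figure is reached."""
    if tbw_terabytes <= 0 or host_writes_gb_per_day <= 0:
        raise ValueError("both arguments must be positive")
    # Datasheets use decimal units, so 1 TB = 1000 GB.
    days = tbw_terabytes * 1000.0 / host_writes_gb_per_day
    return days / 365.0


def drive_writes_per_day(tbw_terabytes: float, capacity_gb: float, warranty_years: float) -> float:
    """DWPD implied by a TBW rating spread over the warranty period."""
    if tbw_terabytes <= 0 or capacity_gb <= 0 or warranty_years <= 0:
        raise ValueError("all arguments must be positive")
    return tbw_terabytes * 1000.0 / (capacity_gb * warranty_years * 365.0)


if __name__ == "__main__":
    # Assumed example values: 100 TBW rating, 120 GB drive, 20 GB/day, 5-year warranty.
    print(f"{years_until_tbw_exhausted(100, 20):.1f} years to reach rated TBW")
    print(f"{drive_writes_per_day(100, 120, 5):.2f} DWPD over the warranty")
```

The real daily write volume can be read from the drive's SMART host-writes counter after a few weeks of normal operation and plugged in instead of the assumed 20 GB/day.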

Our industrial PC vendor fits OEM Phison drives, as far as I can see on the invoices.
I have seen some SSD failures in the field on our machines, but I don't think they were due to the flash being worn out. I suspect mechanical failure of solder joints. You could monitor the drive with HD Sentinel if you want.

Anyway, if you buy larger there are more spare blocks?

Regardless of promised quality, they can still fail, please do backups. Not only to cover hardware issues.
 

Online Berni

  • Super Contributor
  • ***
  • Posts: 5220
  • Country: si
In embedded systems I'd think bitrot would be a bigger concern than exceeding the write cycles. Even the cheapest crappy QLC SSDs can survive being completely overwritten >100 times, if not >1000 times.

One would expect these SSDs to write a correction whenever a sector read encounters an ECC error. So perhaps a better solution is to have some software running on it that slowly reads across areas of the SSD in the background so that the SSD has a chance to scan for bit flips and correct them before too many bit flips accumulate for the ECC code to successfully recover.
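
A minimal sketch of such a background reader, in Python. It simply reads every file under a directory in small chunks with a pause between reads. Whether the drive actually rewrites data after a corrected read depends on its firmware, which is an assumption here; the function name, chunk size and pause are hypothetical choices, and reads served from the OS file cache never reach the flash at all.

```python
import os
import time


def scrub_tree(root, chunk_size=1024 * 1024, pause_s=0.05):
    """Slowly read every file under `root`.

    Returns (bytes_read, unreadable_paths). Unreadable files are collected
    rather than raised, so one bad file does not stop the pass.
    """
    bytes_read = 0
    unreadable = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                        # Throttle so the scrub does not compete with the application.
                        time.sleep(pause_s)
            except OSError:
                unreadable.append(path)
    return bytes_read, unreadable
```

Calling `scrub_tree` on the data directory from a low-priority scheduled task would be one way to run it. It only touches allocated files, not free space or filesystem metadata.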
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 41126
  • Country: au
    • EEVblog
Regardless of promised quality, they can still fail, please do backups. Not only to cover hardware issues.

Wouldn't you just run a RAID array? Many BIOS support that directly.
 

Offline wraper

  • Supporter
  • ****
  • Posts: 18884
  • Country: lv
So perhaps a better solution is to have some software running on it that slowly reads across areas of the SSD in the background so that the SSD has a chance to scan for bit flips and correct them before too many bit flips accumulate for the ECC code to successfully recover.
It's a non-solution, creating problems for nothing. Modern SSDs do this when idle anyway. All you need to prevent bit rot is keep it powered often enough.
 

Offline wraper

  • Supporter
  • ****
  • Posts: 18884
  • Country: lv
Regardless of promised quality, they can still fail, please do backups. Not only to cover hardware issues.

Wouldn't you just run a RAID array? Many BIOS support that directly.
RAID is not a magic bullet and is nowhere close to having actual backups. Not to mention that HW failure is far from the only way you can lose your data.
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 41126
  • Country: au
    • EEVblog
Regardless of promised quality, they can still fail, please do backups. Not only to cover hardware issues.

Wouldn't you just run a RAID array? Many BIOS support that directly.
RAID is not a magic bullet and is nowhere close to having actual backups. Not to mention that HW failure is far from the only way you can lose your data.

Of course, but from an operational embedded system point of view it's going to add a huge amount of redundancy if a drive failure is your concern.
If I were designing a critical embedded system, I'd want to run a RAID O/S disk.
This forum server for example runs a RAID system in order to ensure a disk failure doesn't bring it down.
« Last Edit: April 14, 2025, 10:21:34 am by EEVblog »
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4381
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
RAID could be nice. But the number of embedded computers offering it isn't that big.
For example, I selected a new QBiX recently, and it only fits one M.2 for storage.

Then you would also need NVMe hardware RAID. I don't know if that is a thing.
Plus, you then have only made storage redundant, nothing else. Not even ECC DRAM, which is also not standard.

I don't think it's worth it. Cheaper to have a spare PC. It's running Windows, so that would be the most fragile BOM item.

If you network boot them, they have no storage at all.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8921
The tech Youtubers don't care about this so much since it doesn't meaningfully affect the read/write speed to put a drive in SLC mode. Read speed is unaffected (and is usually limited by the interface speed anyway for fast drives) while write speed is only affected if you write a very large amount of data to overflow the SLC cache.
The majority of tech Youtubers have never cared about reliability beyond the warranty anyway. They're just marketing mouthpieces at this point.
That being said, SLC is not a magic bullet. Modern flash chips have much smaller cells so that more of them can be packed onto a chip. This means each cell has a smaller bucket of charge storing the data, so it more easily leaks or gets flipped. So an old low-capacity TLC-capable chip is likely more reliable than a modern high-capacity chip running in SLC mode.
TLC needs to distinguish between 8 voltage states, vs 2 for SLC. TLC was never specified for more than ~3k cycles per block at most, modern high-density SLC is around 30k-60k.
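
The state counts follow from 2^n for n bits per cell, and the same n gives the capacity cost of running the flash as SLC. A small sketch, assuming capacity scales exactly with bits per cell and ignoring any change in over-provisioning; the function names and the 480 GB example are hypothetical.

```python
def voltage_states(bits_per_cell: int) -> int:
    """Number of charge levels a cell must resolve to store `bits_per_cell` bits."""
    if bits_per_cell < 1:
        raise ValueError("bits_per_cell must be at least 1")
    return 2 ** bits_per_cell


def capacity_in_slc_mode_gb(native_capacity_gb: float, native_bits_per_cell: int) -> float:
    """Capacity left when each cell stores 1 bit instead of its native bit count."""
    if native_bits_per_cell < 1:
        raise ValueError("native_bits_per_cell must be at least 1")
    return native_capacity_gb / native_bits_per_cell


if __name__ == "__main__":
    for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)):
        print(f"{name}: {voltage_states(bits)} states per cell")
    # Assumed example: a 480 GB QLC drive run entirely in SLC mode.
    print(f"{capacity_in_slc_mode_gb(480, 4):.0f} GB usable in SLC mode")
```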

That said, the Samsung 840 EVO mentioned in the title of this thread is a really low bar; it had notoriously bad planar TLC flash with retention measured in weeks at normal room temperature:

https://goughlui.com/2024/07/20/salvage-tested-an-elderly-forgetful-120gb-samsung-840-evo-ssd/
https://forum.acelab.eu.com/viewtopic.php?t=8735
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 41126
  • Country: au
    • EEVblog
That said, the Samsung 840 EVO mentioned in the title of this thread is a really low bar; it had notoriously bad planar TLC flash with retention measured in weeks at normal room temperature:

I just installed a 2TB 990 EVO, hope it's good.
 

Online Berni

  • Super Contributor
  • ***
  • Posts: 5220
  • Country: si
The newer Samsung SSDs had some problems with rapidly eating up the spare sector area (which makes the reported health % decline). But this was apparently just a firmware problem in the controller, so just make sure to update to the latest firmware. (Most people don't do that because you need to install some bloatware to actually do the update.)
 

Offline EEVblog

  • Administrator
  • *****
  • Posts: 41126
  • Country: au
    • EEVblog
The newer Samsung SSDs had some problems with rapidly eating up the spare sector area (which makes the reported health % decline). But this was apparently just a firmware problem in the controller, so just make sure to update to the latest firmware. (Most people don't do that because you need to install some bloatware to actually do the update.)

Samsung Magician says I'm on the latest firmware
 

Offline JohanH

  • Frequent Contributor
  • **
  • Posts: 784
  • Country: fi

Some tests indicate that TLC drives can become corrupt during storage:

https://www.tomshardware.com/pc-components/storage/unpowered-ssd-endurance-investigation-finds-severe-data-loss-and-performance-issues-reminds-us-of-the-importance-of-refreshing-backups

This is not a surprise, as this has been suspected. But further tests are of course needed.
 
The following users thanked this post: Berni

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4381
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Retention of high-density, high-speed SSDs is not that good. Do not use them for archival storage.

You can't have all: speed, density, retention.
 

Offline 5U4GB

  • Super Contributor
  • ***
  • Posts: 1273
  • Country: au

Some tests indicate that TLC drives can become corrupt during storage:

https://www.tomshardware.com/pc-components/storage/unpowered-ssd-endurance-investigation-finds-severe-data-loss-and-performance-issues-reminds-us-of-the-importance-of-refreshing-backups

I don't know how much you can get from that, skimmed through it earlier and a summary seems to be "worn-out no-name Chinese SSDs are garbage", which isn't much of a surprise.
 

Offline InductorX

  • Contributor
  • Posts: 16
  • Country: be
I would look for an enterprise SSD/NVMe drive if longevity is important; those are usually geared towards just that. Higher TBW, MTBF and DWPD ratings are important, as are ECC and TRIM, and all of these are usually more prevalent on enterprise drives. That being said, they are usually a bit more expensive to buy. I usually buy them through enterprise hardware refurbishers, which have either very lightly used ones or unused ones bought in bulk from companies. The better ones mention how many hours of use the drives have had.

Interesting info around longevity: https://www.enterprisestorageforum.com/hardware/ssd-lifespan-how-long-will-your-ssd-work/
 

Offline tooki

  • Super Contributor
  • ***
  • Posts: 14753
  • Country: ch
Regardless of promised quality, they can still fail, please do backups. Not only to cover hardware issues.

Wouldn't you just run a RAID array? Many BIOS support that directly.
RAID is not a magic bullet and is nowhere close to having actual backups. Not to mention that HW failure is far from the only way you can lose your data.

Of course, but from an operational embedded system point of view it's going to add a huge amount of redundancy if a drive failure is your concern.
If I were designing a critical embedded system, I'd want to run a RAID O/S disk.
This forum server for example runs a RAID system in order to ensure a disk failure doesn't bring it down.
Precisely. RAID is not a backup, it’s for improving availability.*

*Except for RAID 0, which dramatically reduces availability. And yes, for extremely high-throughput applications like uncompressed UHD video, striped arrays are used.
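
To put rough numbers on that, here is a sketch under the simplifying assumption that each drive fails independently with the same probability p over some period. Identical drives on identical firmware can fail together, so the mirror figure is optimistic. The 2% value and the function names are illustrative only.

```python
def mirror_loss_probability(p: float, drives: int = 2) -> float:
    """RAID 1: the array is lost only if every mirrored drive fails."""
    if not 0.0 <= p <= 1.0 or drives < 1:
        raise ValueError("p must be in [0, 1] and drives must be >= 1")
    return p ** drives


def stripe_loss_probability(p: float, drives: int = 2) -> float:
    """RAID 0: the array is lost if any single drive fails."""
    if not 0.0 <= p <= 1.0 or drives < 1:
        raise ValueError("p must be in [0, 1] and drives must be >= 1")
    return 1.0 - (1.0 - p) ** drives


if __name__ == "__main__":
    p = 0.02  # assumed per-drive failure probability over the period
    print(f"single drive: {p:.4f}")
    print(f"2-drive RAID 1: {mirror_loss_probability(p):.4f}")
    print(f"2-drive RAID 0: {stripe_loss_probability(p):.4f}")
```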
 

Online Berni

  • Super Contributor
  • ***
  • Posts: 5220
  • Country: si
I don't know how much you can get from that, skimmed through it earlier and a summary seems to be "worn-out no-name Chinese SSDs are garbage", which isn't much of a surprise.

This is a high density flash problem in general.

It has long been known that SSDs are a bad idea for archival storage because charge slowly leaks out of the cells. This gets worse as the cells are made smaller to pack in more of them: less leakage is needed to flip a bit, and packing more levels into a cell makes it even worse.

Retention time used to be a fair number of years, but according to the article these drives are now getting down to low single-digit years.
 

