Author Topic: Explaining large disparities in NVME SSD TBW endurances  (Read 6825 times)


Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #25 on: June 16, 2020, 08:17:45 am »
Quote
BTW I've seen plenty of Seagate HDDs fail and have a few of them lying around which developed bad sectors. Also it's not like Seagate made one bad series and then it became OK. They repeated it several times. Backblaze buys any HDD they can get cheap; they will buy crap as long as the cost benefits outweigh the losses due to failure rate. IIRC they once made a statement about that.

Yes, and they also put lots of cheap consumer-grade drives in large disk arrays, where they are exposed to constant vibrations for which they were never designed.

Those bad drives which I have were not exposed to anything, one HDD per office computer at my work.

Fair enough. I still question how relevant this is for making general statements about a brand's overall reliability.

For example, larger offices tend to buy business PCs from large vendors like HP and Dell, which often use drives that come with non-standard firmware. Also, if a drive fails, the replacement drive is often a refurb or even used drive, not a brand new one. The failure rate here has no relation to even the same original drive that's sold on the open market.

In addition, some desktop hard drives are specified for an average of 8 hrs use per day. If that's exceeded (which in offices it often is) then this may well impact the failure rate you see.

If the drives were bought on the open market, there are other factors which affect failure rate. Are the drives OEM or retail variants? If OEM, are they official or grey market? Are they all from the same batch or from different ones? Date codes? Places of manufacture? What were the firmware versions? And so on.

Just seeing x number of drives failing means zilch for how reliable a specific hard drive manufacturer's products are.
 

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #26 on: June 16, 2020, 08:42:11 am »
TBW shouldn't be used (on its own) as an indicator of drive quality or endurance.

TBW is an indicator of how long the drive will last as a minimum, provided it doesn't suffer from a defect. That's all it is.

Quote
It's largely a number made up by the manufacturers where they guarantee the drive against failure for that amount of TB written to the NAND. If a drive falls short of that figure, it falls under their standard warranty and is replaced. It's a bit of a cost vs. benefit analysis on their part where, through R&D and testing, they expect x% of drives to fail under warranty in a given period and factor that into the price of their products. If they set that TBW figure too high, they might find themselves in a situation where more drives are deemed faulty towards the end of their usable life and start eating into their profit margins; too low and it looks bad for marketing.

TBW isn't made up, it's a calculated figure based on the flash memory endurance and the amount of over-provisioning.
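
As a rough illustration of that kind of calculation, a common back-of-the-envelope estimate is TBW ≈ capacity × rated P/E cycles / write amplification. The sketch below assumes that simplified formula; the function name and the example numbers (1 TB, 600 P/E cycles, write amplification of 2.0) are made up for illustration and not taken from any datasheet. Over-provisioning enters only indirectly here, since more spare area tends to lower write amplification.

```python
def estimate_tbw(capacity_tb: float, pe_cycles: int, write_amplification: float) -> float:
    """Rough TBW estimate (terabytes of host writes).

    Simplified model, an assumption for illustration only:
        TBW = capacity * P/E cycles / write amplification factor
    Real vendor ratings also depend on workload, retention targets
    and error-correction margins.
    """
    if capacity_tb <= 0 or pe_cycles <= 0 or write_amplification <= 0:
        raise ValueError("all inputs must be positive")
    return capacity_tb * pe_cycles / write_amplification


if __name__ == "__main__":
    # Hypothetical drive: 1 TB of NAND rated for 600 P/E cycles.
    # A lower write amplification (e.g. from more over-provisioning)
    # gives a higher estimate for the same flash.
    for waf in (4.0, 2.0, 1.5):
        print(f"WAF {waf}: ~{estimate_tbw(1.0, 600, waf):.0f} TBW")
```

The point is only that the same NAND can end up with quite different TBW ratings depending on how the drive is provisioned and what workload the vendor assumes.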

Exceeding the TBW rating isn't really a warranty issue for consumer-grade drives which, despite their low TBW ratings, will need a lot more than 5 years (currently the longest warranty period) to even come close to that TBW figure, even in more demanding usage scenarios. The overwhelming majority of warranty cases are defects, not exhausted write cycles. And most vendors require you to run some diagnostics on a drive before exchanging it anyway, which would quickly uncover any exceedance of the drive's TBW or temperature limits. If the drive is completely dead then manufacturers just replace it, no matter how many TB have been written.
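
To put the "a lot more than 5 years" into numbers, here is a minimal sketch that converts a TBW rating and a daily host write volume into years. The 600 TBW rating and the 50 GB/day figure are hypothetical example values, not measurements, and the function name is just for illustration.

```python
def years_to_reach_tbw(tbw_tb: float, host_writes_gb_per_day: float) -> float:
    """Years of use until the rated TBW is reached at a constant daily write volume.

    Uses decimal units (1 TB = 1000 GB) and a 365-day year.
    """
    if tbw_tb <= 0 or host_writes_gb_per_day <= 0:
        raise ValueError("inputs must be positive")
    days = tbw_tb * 1000 / host_writes_gb_per_day
    return days / 365


if __name__ == "__main__":
    # Hypothetical consumer drive rated at 600 TBW, written at 50 GB per day.
    years = years_to_reach_tbw(600, 50)
    print(f"~{years:.1f} years to reach the TBW rating")
```

With these example values the rating is reached only after several decades, far beyond a 5-year warranty period.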

It's a bit different for enterprise drives, which often come with endurance ratings in the PBW range. In write-intensive applications a normal economy or mixed-use SSD can quickly exceed its PBW rating; however, these are normally covered by enterprise-level service agreements, so they will be replaced anyway, no matter whether the PBW rating is exceeded or not.

Quote
Take it with a grain of salt. NAND quality does vary and there are reasons why cheap Chinese drives are cheap.

Indeed.
 

Online wraper

  • Supporter
  • ****
  • Posts: 18066
  • Country: lv
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #27 on: June 16, 2020, 08:56:44 am »
Take it with a grain of salt. NAND quality does vary and there are reasons why cheap Chinese drives are cheap.
Chinese SSDs often use NAND which is basically rejects, due to a large number of bad blocks or other issues. Normally the highest grade goes into SSDs and the lower grade into memory cards. But NAND manufacturers also sell complete crap for cheap, often not marked at all. And then crap not suitable even for memory cards goes into SSDs.

In this particular case the Spectek NAND (Spectek is a Micron division which recovers batches of rejects) has lines over it, which means it's below even Spectek standards: the package is not flat, balls are missing, etc.
 
The following users thanked this post: Lenclume

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #28 on: June 16, 2020, 09:00:50 am »
On fast enterprise NVMe drives like the Micron 9200 Series U.2 drives the flash memory gets so hot that the drive requires active cooling, despite having a metal housing with embedded cooling fins which acts as a heat sink. Without active cooling the drive literally gets too hot to touch during operation.
And where did you get the info that it's Flash that gets hot, not controller?

Thermal imaging of an open drive.

Quote
There is nothing exceptional about its write speeds compared to regular NVMe SSDs. IME, on NVMe SSDs that are continuously writing at max speed the Flash ICs become simply warm, while the controllers easily run at 100 °C.

No, there's nothing exceptional about its write speed. It's after all just a figure for the max write rate without any statement of how long it can be sustained.

But enterprise-grade drives maintain their high write speeds long after many consumer-grade drives see their initially high write speed drop like a rock.

Quote
Quote
as soon as the flash memory reaches a certain temperature.

Except there is no temperature sensor for Flash normally.

Yup, they are on the board, thermally coupled to the memory IC.
« Last Edit: June 16, 2020, 09:07:01 am by Wuerstchenhund »
 

Offline Wuerstchenhund

  • Super Contributor
  • ***
  • Posts: 3088
  • Country: gb
  • Able to drop by occasionally only
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #29 on: June 16, 2020, 09:06:14 am »
Take it with a grain of salt. NAND quality does vary and there are reasons why cheap Chinese drives are cheap.

Chinese SSDs often use NAND which is basically rejects, due to a large number of bad blocks or other issues. Normally the highest grade goes into SSDs and the lower grade into memory cards. But NAND manufacturers also sell complete crap for cheap, often not marked at all. And then crap not suitable even for memory cards goes into SSDs.

Indeed. And often with zero over-provisioning and no (working) garbage collection in the firmware.

Which is why the vendors often don't list an exact capacity but just an estimate (as the capacity depends on how much of the memory is actually still working).

I'd rather buy a 2nd hand brand name drive than any of the Chinese crap-level drives.
 

Offline rrinker

  • Super Contributor
  • ***
  • Posts: 2046
  • Country: us
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #30 on: June 16, 2020, 03:01:55 pm »
 Another that gets good reviews and is near the top of the heap in performance is addlink. I used two different ones in my most recent builds. Since I didn't need super high speed in my server, I used a PCIe 3.0 one in the server, even though the MB supports PCIe 4.0 (AMD). For my desktop, I used the PCIe 4.0 model. Both readily available on Amazon.
 

Offline Electro Fan

  • Super Contributor
  • ***
  • Posts: 3321
 

Offline rrinker

  • Super Contributor
  • ***
  • Posts: 2046
  • Country: us
Re: Explaining large disparities in NVME SSD TBW endurances
« Reply #32 on: July 10, 2020, 03:45:30 pm »
 Besides being overpriced?

This is the one I went with: https://www.amazon.com/gp/product/B083DPYVGS/ref=ppx_yo_dt_b_asin_title_o02_s00?ie=UTF8&psc=1

Also no sense in a PCIe Gen 4 device unless you are using an AMD processor like my system. Even the newest Intel CPUs are still stuck at PCIe 3.0.
 

