The amount of writes an SSD can handle varies depending on the NAND type: SLC, MLC, TLC, QLC, from best to worst.
Very few SSDs are still MLC; the majority of drives these days are either TLC or QLC.
Examples of TLC drives:
WD SN570 (I think it uses 112-layer TLC memory chips):
https://cdn.cnetcontent.com/syndication/feeds/wd/inline-content/C2/2D253893B5A85CE8F7AAFA3C5D3B0CF3E5D96C5B_PRODUCTBRIEFWDBLUESN570NVMESSD_source.PDF
Note it's rated 150 TBW for the 250 GB model, 300 TBW for the 500 GB model and 600 TBW for the 1 TB model.
Same ratings for the WD SN550; here's the datasheet (it uses 96-layer TLC memory, if my memory is correct):
https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-blue-nvme-ssd/data-sheet-wd-blue-sn550-nvme-ssd-idk.pdf
The 112-layer memory is technically a bit worse, but they stretch the write endurance by using a bigger write cache and a newer, better controller: they switch some amount of the TLC memory to pseudo-SLC mode.
The Samsung 980 uses the latest TLC Samsung makes and also carries a 150/300/600 TBW rating:
https://semiconductor.samsung.com/resources/data-sheet/Samsung_NVMe_SSD_980_Data_Sheet_Rev.1.1.pdf
The Samsung 980 has one of the biggest write caches; if there's enough free space it has up to around 60 GB of pseudo-SLC write cache (for the 1 TB model)...
QLC memory is much worse. For example, Kingston NV1 drives are rated at 240 TBW for the 1 TB model, and Samsung QVO drives are rated at around 360 TBW for the same 1 TB capacity.
Sure, you can actually write more than these rated amounts, but some blocks of memory will wear out and the SSD will degrade.
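To put those TBW numbers in perspective, here's a rough back-of-the-envelope sketch (the 50 GB/day figure is just an assumption for illustration):

    # Rough lifetime estimate from a drive's rated endurance (TBW).
    # The daily write volume below is an illustrative assumption, not a measurement.

    def years_of_life(rated_tbw: float, gb_written_per_day: float) -> float:
        """Return the rough number of years before the rated TBW is exhausted."""
        tb_per_year = gb_written_per_day * 365 / 1000
        return rated_tbw / tb_per_year

    # Example: a 1 TB TLC drive rated 600 TBW vs. a 1 TB QLC drive rated 240 TBW,
    # both assuming a fairly heavy 50 GB of host writes per day.
    for label, tbw in [("TLC, 600 TBW", 600), ("QLC, 240 TBW", 240)]:
        print(f"{label}: ~{years_of_life(tbw, 50):.0f} years at 50 GB/day")

At that rate even the QLC drive's rating lasts more than a decade, so the ratings mostly matter for write-heavy workloads.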
And as for the M1 Apple machines... this thread reminded me of the following; not sure if it has been fixed or not:
https://twitter.com/marcan42/status/1494213855387734019?lang=en
Quoting from the thread:
Hector Martin (@marcan42), Feb 17, 2022:
Well, this is unfortunate. It turns out Apple's custom NVMe drives are amazingly fast - if you don't care about data integrity.
If you do, they drop down to HDD performance. Thread.
For a while, we've noticed that random write performance with fsync (data integrity) on Asahi Linux (and also on Linux on T2 Macs) was terrible. As in 46 IOPS terrible. That's slower than many modern HDDs.
We thought we were missing something, since this didn't happen on macOS.
As it turns out, macOS cheats. On Linux, fsync() will both flush writes to the drive, and ask it to flush its write cache to stable storage.
But on macOS, fsync() only flushes writes to the drive. Instead, they provide an F_FULLFSYNC operation to do what fsync() does on Linux.
So effectively macOS cheats on benchmarks; fio on macOS does not give numbers comparable to Linux, and databases and other applications requiring data integrity on macOS need to special-case it and use F_FULLFSYNC.
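(Stepping outside the quoted thread for a second: a minimal sketch of what that special-casing can look like in Python. This is my illustration, assuming CPython's fcntl module, which exposes F_FULLFSYNC on macOS.)

    import os
    import sys
    import fcntl  # CPython's fcntl module exposes F_FULLFSYNC on macOS

    def flush_to_stable_storage(fd):
        """Ask that written data actually reach stable storage, not just the drive cache."""
        if sys.platform == "darwin" and hasattr(fcntl, "F_FULLFSYNC"):
            # macOS: fsync() only pushes data to the drive; F_FULLFSYNC also
            # asks the drive to flush its volatile write cache.
            fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
        else:
            # Linux: fsync() already implies the drive cache flush.
            os.fsync(fd)

    # Usage: write a file so the data should survive a sudden power loss.
    with open("important.dat", "wb") as f:
        f.write(b"state that must survive a power cut")
        f.flush()                        # drain Python's userspace buffer first
        flush_to_stable_storage(f.fileno())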
How bad is it if you use F_FULLFSYNC? It's bad.
Single threaded, simple Python file rewrite test:
Macbook Air M1 (macOS):
- flushing: 46 IOPS
- not: 40000 IOPS
x86 iMac + WD SN550 1TB NVMe (Linux):
- flushing: 2000 IOPS
- not: 20000 IOPS
x86 laptop + Samsung SSD 860 EVO 500GB SATA:
- flushing: 143 IOPS
- not: 5000 IOPS
So, effectively, Apple's drive is faster than all the others without cache flushes, but it is more than 3 times slower than a lowly SATA SSD at flushing its cache. Even if all you wrote is a couple of sectors. You pay a huge flush penalty if you do *any* writes.
Here, "flushing" on macOS means F_FULLSYNC and "not" means fsync(); on Linux both are fsync(), but "not flushing" is measured by telling Linux that the drive write cache is write-through (which stops it from issuing cache flushes).
Note that the numbers are filesystem-dependent (and encryption makes things more complicated); e.g. the SATA SSD numbers double on VFAT vs. my root filesystem (ext4 on LVM on dm-crypt), but the pattern is clear.
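(Again stepping outside the thread: the exact test script isn't included, but a single-threaded file rewrite benchmark in that spirit might look roughly like this; the file path and duration here are made up for illustration.)

    import os
    import time

    def fsynced_rewrite_iops(path, seconds=5.0, block=b"x" * 4096):
        """Rewrite the same 4 KiB block of one file in a loop, fsync()ing every write."""
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        writes = 0
        deadline = time.monotonic() + seconds
        try:
            while time.monotonic() < deadline:
                os.pwrite(fd, block, 0)
                os.fsync(fd)   # on Linux this also sends a drive cache flush
                writes += 1
        finally:
            os.close(fd)
        return writes / seconds

    if __name__ == "__main__":
        print("~%.0f fsync'd rewrites per second" % fsynced_rewrite_iops("testfile.bin"))

If I understand the setup correctly, the Linux "not flushing" numbers come from declaring the drive cache write-through at the block layer (writing "write through" to /sys/block/<device>/queue/write_cache), so the kernel stops issuing flush commands.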
macOS doesn't even seem to try to proactively issue syncs; you can write a file on macOS, fsync() it, wait 5 seconds, issue a hard reboot (e.g. via USB-PD command), and the data is gone. That's pretty bad.
Of course, in normal usage, this is basically never an issue on laptops; given the right software hooks, they should never run out of power before the OS has a chance to issue a disk flush command. But it certainly is for desktops. And it's a bit fragile re: panics and such.
Unfortunately, this manifests itself as quite visible issues on Linux. For example, apt-get on Asahi Linux is noticeably slow. Making fsync() not really flush on macOS is not fair; lots of portable software is written to assume fsync() means your data is safe.
Our current thinking is we're going to add a knob to the NVMe driver to defer flush requests up to a maximum time of e.g. 1 second. That would ensure that a hard shutdown never loses you more than 1 second of data, which is better than what macOS can claim right now.
Alas, that's still not quite safe. Not flushing means we cannot guarantee ordering of writes, which means you could end up with actual data corruption in e.g. a database, not just data loss. There's no good way around this other than doing full flushes.
So the unfortunate conclusion is that if you're e.g. running a transactional database on Apple hardware, and you need to be able to survive a hard poweroff without data corruption, you're never going to get more than ~46 TPS.
Unless Apple improves their ANS firmware to fix this.
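(One more aside to make the ordering point concrete: a textbook write-ahead-log commit depends on the log record being durable before the in-place data update. A minimal sketch, with a made-up record format:)

    import os

    def wal_commit(log_fd, data_fd, offset, new_value):
        """Write-ahead-log commit: the log record must reach stable storage
        BEFORE the in-place data update, otherwise recovery cannot repair a
        crash that happens between the two writes."""
        os.write(log_fd, b"UPDATE %d %s\n" % (offset, new_value))
        os.fsync(log_fd)                  # barrier 1: log record is durable
        os.pwrite(data_fd, new_value, offset)
        os.fsync(data_fd)                 # barrier 2: data update is durable

If fsync() returns without the drive cache actually being flushed, the drive is free to persist the data write before the log record; lose power in between and recovery has nothing with which to undo the half-applied update.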
And for what it's worth, I inadvertently triggered a data consistency issue in macOS while testing this. Before running any tests I had GarageBand open. I closed it without saving the open project. After the first hard reboot later, it tried to reopen it and threw up an error.
So I guess the unsaved project file got (partially?) deleted, but not the state that tells it to reopen the currently open file on startup.
Data consistency matters.