Author Topic: Backup SSD - how to keep data (Read 5608 times)

harerod · « **on:** October 06, 2023, 09:53:47 pm »

This thread prompted me to finally request your expertise about keeping data on backup SSD's:
https://www.eevblog.com/forum/general-computing/help-me-choose-qty6-sata-disks-for-my-multi-node-nas/msg5094024/#msg5094024

Background: I keep a double set of spare components for some critical computers. All software is legal and paid for. Since the activation servers have gone offline (Microsoft, Adobe, ...), I clone the system SSD's to allow for quick recovery in some fault scenarios.

Independent of SSD brand - what would be a good strategy to "exercise" the SSD's to keep their memory? I assume that occasional power-up and idling for several hours wouldn't do much. Under which conditions can one assume that the internal SSD controller would refresh/maintain the data? As was hinted in the other thread- maybe a read (Linux dd of the whole storage comes to mind)?
Another thing, but only a work-around, could be a dd-dump of the SSD onto a different disk and then trying to restore that. I will have to check if this would keep all licenses active.

Again, the specific question is - under which scenario is a backup SSD most likely to retain its data for several years?

golden_labels · « **Reply #1 on:** October 06, 2023, 11:55:03 pm »

The topic was touched here, but unfortunately that was never clarified.

Error detection and correction is an inherent part of data encoding in the SSD itself, so any read triggers it for the data being accessed. So reading the entire medium should suffice:

Code: [Select]

cat /dev/yourssd >/dev/null

Or, using pv to see progress:

Code: [Select]

pv -ctrabp /dev/yourssd >/dev/null

While this is guaranteed to work, it is suboptimal. It requires transferring all the data to RAM and it reads sectors in random order. SSD’s built-in mechanisms could do better. Which is why I asked for clarification in the other thread.
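A minimal sketch of the full read described above, wrapped so that it falls back to plain cat when pv is not installed. The function name read_whole_device is made up for illustration and /dev/yourssd remains a placeholder. Whether a read actually triggers a refresh still depends on the drive's firmware.

```shell
#!/bin/sh
# Read every byte of a device (or file) and discard the data.
# Assumption: the function name is illustrative, not an existing tool.
read_whole_device() {
    target=$1
    if [ ! -r "$target" ]; then
        echo "cannot read: $target" >&2
        return 1
    fi
    if command -v pv >/dev/null 2>&1; then
        # Same pv flags as in the post: progress bar, timer, rate,
        # average rate, and byte count.
        pv -ctrabp "$target" >/dev/null
    else
        cat "$target" >/dev/null
    fi
}

# Usage (raw devices normally need root):
#   read_whole_device /dev/yourssd
```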

SiliconWizard · « **Reply #2 on:** October 07, 2023, 12:23:19 am »

I don't think it's really possible to answer this question in general, as it will depend to a large extent on the SSD's controller and its firmware. While they probably all use similar strategies to "refresh" content, the minute details of how and when they do that is not known to the public and likely varies quite a bit from one drive to the next.

Whether it is enough to power them on on a regular basis with no data access at all, or with just basic filesystem access (so no particular action from the user, but plugged into a computer running some OS), or whether you have to explicitly *access* the data area for which you want to maximize the odds of it being checked and refreshed if necessary, I have like zero answer to that. And as I just said, it probably depends on the controller's firmware to a large extent.

I doubt you actually need to explicitly access data on the drive to guarantee retention, otherwise any files that would be stored but rarely accessed on your SSD would run the risk of data corruption, and that would be like such a major downside to using SSDs that consumers would really not enjoy the idea. My guess is that most controllers probably check for different areas cyclically and transparently, so users don't have to actually access files to ensure retention. But I have never run into the source code of an SSD's firmware, so all bets are off.

golden_labels · « **Reply #3 on:** October 07, 2023, 02:23:12 am »

Not entirely sure if SiliconWizard’s answer is referring to my post or not. If not, ignore this.

Error detection and correction is not an “emergency” procedure,⁽¹⁾ but an inseparable part of data encoding. Which is why reading data must invoke verification and, if needed, refreshing. Reading the entire storage area will inevitably ensure its refreshing where needed. The problem is that, as I mentioned earlier, this method is not optimal.

What I wrote should not be read in the opposite direction. Active reading requires refreshing, but refreshing doesn’t require active reading. During normal operation, while the SSD is powered often and for extended periods of time, firmware may perform verification in the background. But in this case this is not going to work.

⁽¹⁾ In the distant past the perception was different, which is why the opposite view still persists today.

Georgy.Moshkin · « **Reply #4 on:** October 07, 2023, 03:22:49 am »

I always do a full backup and delete the oldest one. I use 7zip with zero compression level (store) and noticed that copying speed is different for command line, GUI and when adding to archive through FAR manager. I've created a batch file that puts my work folder into a ZIP on a USB SSD drive; the filename contains the current date and time. Unfortunately the batch process slows down to 60 MB/s after some time, so I temporarily switched to FAR manager's add to archive option, which surprisingly provides 100 to 200 MB/s and moves a 100 GB+ work folder to the backup location in less than 15 minutes (same 7zip dll). I have three SSD drives and cycle through them from time to time to be safe. On fast CPUs you can try ZSTD compression. The archive can be tested using 7z checksums. I don't know a method to maintain old backups other than simple copying.

DiTBho · « **Reply #5 on:** October 07, 2023, 07:49:47 am »

Quote from: SiliconWizard on October 07, 2023, 12:23:19 am

My guess is that most controllers probably check for different areas cyclically and transparently

this is most likely the case

I don't like SSDs because they need command extensions and precautions in the drivers (firmware on the SBC and kernel side) which frankly I don't have the desire or time to support

harerod · « **Reply #6 on:** October 07, 2023, 08:26:57 am »

Quote from: golden_labels on October 06, 2023, 11:55:03 pm

The topic was touched here, but unfortunately that was never clarified.
...
Code: [Select]
cat /dev/yourssd >/dev/null
Or, using pv to see progress:
Code: [Select]
pv -ctrabp /dev/yourssd >/dev/null
...

Thanks for backing up my initial thoughts and for the Linux command sequences. I wonder if there are any built-in functions of Windows that would trigger the same behaviour. Maybe run checkdisk?
My fastest dedicated Linux box is a Raspi3 (with a new Raspi4 waiting to be set up), otherwise I run VMware images under Windows. While connecting the SSD's to the Raspi via a USB docking station should work, the performance would be an order of magnitude below the SSD's potential. (guesstimate: 1TB/30MB/s ~ 10h)
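The guesstimate above can be checked with plain shell integer arithmetic. This is only a sketch: the function name is invented, the capacity is taken as a decimal terabyte, and the rate is assumed to be sustained for the whole read.

```shell
#!/bin/sh
# Estimate how long a full sequential read takes.
# Arguments: capacity in bytes, sustained rate in bytes per second.
# Prints whole minutes (integer division, so the result is rounded down).
estimate_read_minutes() {
    capacity_bytes=$1
    rate_bytes_per_s=$2
    echo $(( capacity_bytes / rate_bytes_per_s / 60 ))
}

# 1 TB (decimal) at roughly 30 MB/s over USB2:
estimate_read_minutes 1000000000000 30000000    # prints 555, a bit over 9 hours
```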

AndyBeez · « **Reply #7 on:** October 07, 2023, 09:24:38 am »

I assume you also store your mission critical SSDs in a fire resistant 'disc safe'?

DiTBho · « **Reply #8 on:** October 07, 2023, 09:29:50 am »

Quote from: golden_labels on October 07, 2023, 02:23:12 am

Not entirely sure if SiliconWizard’s answer is referring to my post or not. If not, ignore this.

Error detection and correction is not an “emergency” procedure,⁽¹⁾ but an inseparable part of data encoding. Which is why reading data must invoke verification and, if needed, refreshing. Reading the entire storage area will inevitably ensure its refreshing where needed. The problem is that, as I mentioned earlier, this method is not optimal.

What I wrote should not be read in the opposite direction. Active reading requires refreshing, but refreshing doesn’t require active reading. During normal operation, while the SSD is powered often and for extended periods of time, firmware may perform verification in the background. But in this case this is not going to work.

⁽¹⁾ In the distant past the perception was different, which is why the opposite view still persists today.

yup, that old trick is sufficient but not necessary; in my opinion it's likely that there are some firmware commands, or proprietary extensions, which internally invoke { reading, verification, refreshing }, but done in an optimal way.

Here we need to understand if the documentation is public

Jackster · « **Reply #9 on:** October 07, 2023, 10:33:54 am »

I use AOMEI Backupper to create backup files of any drives like this.
Just restored an SSD for a machine this morning using it.

Close to 1:1 in terms of backup size, so a 20GB Windows install with software is only a ~20GB file,
which you can store anywhere including the "cloud", so no need to worry about physical media.

David Hess · « **Reply #10 on:** October 08, 2023, 11:40:44 am »

My current backup strategy includes writing changed files to the backup SSD and removing deleted files so the content of both volumes is the same, and then creating a hash file of the source, and checking the entire backup with that hash file, so all content and metadata get read on both the source and backup.

In theory this will find (and report) any corrupted sectors on both, and on an SSD, it should force the scrub-on-read operation if a sector is failing.
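A rough sketch of the hash-file step, assuming GNU coreutils and findutils (sort -z, xargs -0 -r, sha256sum --quiet). The function names are invented, the manifest must be given as an absolute path outside both trees, and only file contents are covered here, not metadata.

```shell
#!/bin/sh
# Build a SHA-256 manifest of every file under a source directory.
# Paths in the manifest are relative, so it can be checked against a copy.
make_manifest() {
    src_dir=$1
    manifest=$2    # absolute path, stored outside src_dir
    ( cd "$src_dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Re-read every listed file in the backup and compare it with the manifest.
# Returns non-zero if any file is missing or its content differs.
check_against_manifest() {
    backup_dir=$1
    manifest=$2
    ( cd "$backup_dir" && sha256sum --quiet -c "$manifest" )
}

# Usage (paths are placeholders):
#   make_manifest /mnt/source /tmp/source.sha256
#   check_against_manifest /mnt/backup /tmp/source.sha256
```

Because the check reads every file on the backup, it doubles as the full read discussed earlier in the thread, at least for the space occupied by files.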

Shonky · « **Reply #11 on:** October 08, 2023, 11:47:31 am »

Quote from: David Hess on October 08, 2023, 11:40:44 am

My current backup strategy includes writing changed files to the backup SSD and removing deleted files so the content of both volumes is the same, and then creating a hash file of the source, and checking the entire backup with that hash file, so all content and metadata get read on both the source and backup.

In theory this will find (and report) any corrupted sectors on both, and on an SSD, it should force the scrub-on-read operation if a sector is failing.

What happens if a file is deleted or corrupted and you don't notice before a backup occurs? This sounds more like a manual RAID1.

wraper · « **Reply #12 on:** October 08, 2023, 12:00:32 pm »

For leaving a drive unpowered long term, you want as few levels per cell as possible. SLC would be perfect, but not really an option these days. So it should be MLC or at least no worse than TLC. Also it should be a brand new drive or one with minimal use. With cell wear, retention time drops drastically. Also you should ensure that the SSD uses a robust ECC mechanism such as LDPC.

David Hess · « **Reply #13 on:** October 08, 2023, 12:26:31 pm »

Quote from: Shonky on October 08, 2023, 11:47:31 am

Quote from: David Hess on October 08, 2023, 11:40:44 am
My current backup strategy includes writing changed files to the backup SSD and removing deleted files so the content of both volumes is the same, and then creating a hash file of the source, and checking the entire backup with that hash file, so all content and metadata get read on both the source and backup.

In theory this will find (and report) any corrupted sectors on both, and on an SSD, it should force the scrub-on-read operation if a sector is failing.

What happens if a file is deleted or corrupted and you don't notice before a backup occurs? This sounds more like a manual RAID1.

It is like RAID1, but the copy can be kept offline. A better backup system would be to archive changed files after a full backup which is what I used to do, but that is impractical when backing up from one 2TB volume to another 2TB volume, or where there are large changes on the original volume.

In the future I may be backing up my portable 2TB SSDs to my workstation's huge RAID volume instead of other portable 2TB SSDs.

harerod · « **Reply #14 on:** October 08, 2023, 12:29:36 pm »

Picking up from golden_labels' hint:
I run an older OS version on my Raspi3 (which is usually blocked from going online in the router and runs some rather specific local tasks), so I couldn't install the pv command. The Raspi3 only has USB2 ports, so we could run "sudo cat /dev/sda >/dev/null" at about 30MByte/s and wait for 10 hours, while watching CPU load with "htop".

This prompted me to MacGyver an external docking station to a spare headless Raspi4.
Sequence to set things up:
- "sudo apt-get update"
- "sudo apt-get install pv"

Get things running:
- plug in external drive
- check with "df -h"
- if not mounted, check "sudo fdisk -l"
- run read "sudo pv -ctrabp /dev/sda >/dev/null" <- this is processing a 1TB EVO870 while I am typing these lines. USB2 connection yields 30MByte/s, USB3 ~288MByte/s

Wrapping things up:
- "sync"
- "sudo umount /dev/sda"
- cross fingers that this voodoo worked

Next step: dump that SSD into a file on another drive, then restore from file to another SSD. Would be great if that worked. My doubt is not about Linux dd, but due to my lack of knowledge regarding the various copy protection schemes. At any rate, this thread is about figuring out how to maintain data retention on SSD's.
Update: pv dumped that 1TB SSD within 50min @ 286MiB/s. Right now I am preparing an HDD to accept the SSD's content.
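Where pv is not available, the throughput figures quoted above (about 30 MB/s over USB2, about 288 MB/s over USB3) can be approximated with standard tools. A sketch with an invented function name; the one-second timer resolution makes it meaningless for small inputs.

```shell
#!/bin/sh
# Read a device or file completely and report bytes, seconds, and MB/s.
# Piping through cat forces the data to actually be read.
timed_read() {
    target=$1
    [ -r "$target" ] || { echo "cannot read: $target" >&2; return 1; }
    start=$(date +%s)
    bytes=$(cat "$target" | wc -c | tr -d ' ')
    end=$(date +%s)
    elapsed=$(( end - start ))
    [ "$elapsed" -gt 0 ] || elapsed=1    # avoid division by zero on tiny inputs
    echo "bytes=$bytes seconds=$elapsed MB_per_s=$(( bytes / elapsed / 1000000 ))"
}

# Usage (device path as in the post, needs root):
#   timed_read /dev/sda
```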

DiTBho · « **Reply #15 on:** October 08, 2023, 01:04:50 pm »

Quote from: wraper on October 08, 2023, 12:00:32 pm

For leaving with no power for long term, you want as minimum cell levels as possible. SLC would be perfect, but not really an option these days. So it should be MLC or at least no worse than TLC. Also it should be a brand new drive or with minimal use. With cell wear retention time drops drastically. Also you should ensure that SSD uses robust ECC mechanism such as LDPC.

ah, if Solid State Drives used ferroelectric memory, and I had stable data on its implementation, then... I would be willing to invest in SSD/FeRAM

DiTBho · « **Reply #16 on:** October 08, 2023, 01:07:18 pm »

Quote from: harerod on October 08, 2023, 12:29:36 pm

I couldn't install the pv command

if you have Gcc&C, you can download the source, configure, and compile, as I did yesterday.

harerod · « **Reply #17 on:** October 08, 2023, 01:36:05 pm »

DiTBho, good to know, I'll keep that in mind, thanks. Last time I dived so deeply into Linux, was when I had to port drivers to kernel 2.4 for an embedded board that we had designed. Since then I have been a bit out of the loop. I love Linux for low level stuff, like what we are discussing here. Most of the time I run virtual machines. This was the first time I fired up that Raspi4 (which I only got a couple of months ago). I found the USB3 performance delightful.

harerod · « **Reply #18 on:** October 08, 2023, 03:52:50 pm »

The only large HDD ready at hand is NTFS formatted. NTFS write support is done by a usermode driver: "sudo apt-get install ntfs-3g".
As it turns out, write speed for the Raspi4 from SSD to the NTFS device is about 30MiB/s max, no matter if dd or pv is used. Directly piping through gzip results in an average write speed of about 7MiB/s.
Since this is going to be a long operation, running well overnight, I decided to use the full CPU and do both writes at the same time.

Next step is dumping the data back to an SSD, to see if it will work without issues. Again, not exactly what I was looking for, when I started this thread, but maybe a viable workaround for my task at hand.
And incidentally that source SSD gets exercised as a side effect...

harerod · « **Reply #19 on:** October 12, 2023, 03:09:20 pm »

Alright, what have we learned in this thread?

- Without detailed manufacturer information, we cannot know how an SSD works internally. Therefore our best bet is to make the SSD take a close look at itself and let it figure out the charge levels in its memory cells. This can be done most easily under a Linux environment. Using its USB3 ports, a Raspi4 running "sudo cat /dev/sda >/dev/null" will read a Samsung EVO 870 in an Inateck FD1003 docking station at about 288MB/s.

- Since we want to read the whole SSD, we can use this opportunity to dump its complete contents into an image file on a regular backup medium, in my case large HDD's. Later I wrote the image back and had all copy protections/licenses intact. Image integrity can be checked via a checksum tool, e.g. md5sum or sha256sum.
Data transfer speeds showed large variations:
sudo pv /dev/sda > /mnt/usbhdd/ss_pv.img to an NTFS formatted Seagate 5TB USB HDD: ~30MB/s, with gzip ~7MB/s
sudo pv /dev/sda > /mnt/usbhdd/ss_pv.img to an EXT4 formatted Seagate 2TB USB HDD: ~80..100MB/s (limited by HDD transfer speed), with gzip ~7MB/s(!)

Restore from EXT4 media:
sudo dd if=/mnt/usbhdd/ssd.img of=/dev/sdb bs=10MB conv=noerror,sync status=progress
Again, mostly limited by HDD transfer speed to ~88MB/s.

Update: md5sum crunched that 1TB image in less than three hours (~123MiB/s). Looks like the HDD speed was the limiting factor.
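The dump-and-check step from the summary above could be sketched like this. It hashes the data while it is being written, so the SSD is read only once, and then hashes the finished image for comparison. The function name is invented, sha256sum stands in for md5sum, and the paths in the usage line are the ones from the post.

```shell
#!/bin/sh
# Copy a source device (or file) into an image file and confirm that the
# image matches what was read, by comparing SHA-256 digests.
image_and_verify() {
    src=$1
    img=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # tee writes the image while sha256sum hashes the same byte stream.
    src_sum=$(tee "$img" < "$src" | sha256sum | cut -d' ' -f1)
    sync
    img_sum=$(sha256sum < "$img" | cut -d' ' -f1)
    if [ "$src_sum" = "$img_sum" ]; then
        echo "OK $img_sum"
    else
        echo "MISMATCH source=$src_sum image=$img_sum" >&2
        return 1
    fi
}

# Usage (needs root for the raw device):
#   image_and_verify /dev/sda /mnt/usbhdd/ss_pv.img
```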

golden_labels · « **Reply #20 on:** October 13, 2023, 04:23:22 am »

There is a mistake in the dd invocation. One that may destroy your data. This is one of the two primary reasons I so strongly drive people away from using Data Destroyer until actually needed: notorious misunderstanding of tool’s operation and its arguments. dd is a prime example of magical thinking among *nix folk. Not leading to catastrophes only thanks to lucky coincidences.

The sync in the above invocation is randomly adding runs of zero bytes in data being copied.⁽¹⁾ You might have meant the sync oflag (not conv), but for this bs is wrong. It is specified in megabytes instead of mebibytes. While it will not prevent data from being written, metric units will not be aligned to sector sizes (or erasure block sizes on flash media). That will lead to increased wear and may cause lower performance.⁽²⁾ But to make bs be respected in a way suitable for sync, one also has to use GNU’s dd and its non-standard fullblock iflag. With all these options set correctly, larger block sizes are still preferred to not disrupt storage medium’s internal caches and (in flash) give allocation algorithms some more air. They are not designed to see unneeded, frequent flushes.

I’m not an expert on storage technology. Perhaps DiTBho could provide more insight in this end of the issue. I am also expecting I missed half of the problems dd may cause in this scenario.

⁽¹⁾ The technical definition of this flag is padding incomplete blocks of data with zeros before passing them to the write syscall. But without right understanding of kernel handling of specific devices and carefully chosen arguments it becomes random. And it never makes sense in restoring images.
⁽²⁾ “May” because it’s kernel, not dd, that does the I/O. Kernel’s I/O strategy is likely to provide enough of cushion to make disruptions from dd negligible.
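The padding described in footnote 1 is easy to see on a tiny input, and the adjustments from the second paragraph can be put into one invocation. This is a sketch only, assuming GNU dd: the function name and the 4M block size are illustrative choices, not values given in the post.

```shell
#!/bin/sh
# conv=sync pads every short input block with NUL bytes up to the block size.
# Three bytes in, eight bytes out:
printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c    # prints 8

# The same input without conv=sync passes through unchanged:
printf 'abc' | dd bs=8 2>/dev/null | wc -c              # prints 3

# Restore sketch following the post: a binary block size (4M means 4 MiB in
# GNU dd, while 4MB would mean 4,000,000 bytes), fullblock so that short
# reads are refilled, and sync given as an output flag instead of a conv.
restore_image() {
    img=$1
    target=$2
    dd if="$img" of="$target" bs=4M iflag=fullblock oflag=sync status=progress
}

# Usage (paths as in the earlier post, needs root):
#   restore_image /mnt/usbhdd/ssd.img /dev/sdb
```

For anyone who prefers to avoid dd entirely, as the post suggests, redirecting the image to the device with cat or pv from a root shell and then running sync is the usual alternative.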

harerod · « **Reply #21 on:** October 13, 2023, 08:05:37 am »

Thanks for the warning. Unless I made a typo (see screenshots), these commands led to a functioning Win7 setup, comprising three partitions on that SSD. I'd gladly accept any expert input on a better command syntax.
A couple of decades ago I wrote Linux 2.4 kernel drivers. I'd never even try to pretend to understand that OS. All my work is done by reading man pages and then actual operation reports on the internet.

wraper · « **Reply #22 on:** October 13, 2023, 08:15:41 am »

You should take into account that while reading SSD you waste its time it could spend on housekeeping. So if you read it all and then turn off, it's worse than just leaving it powered and doing nothing for the same amount of time.

golden_labels · « **Reply #23 on:** October 13, 2023, 08:31:00 am »

Regarding wraper’s message above
(Added after the failure to answer this)
Before following wraper’s advice and risking data loss, it may be worth seeing the exchange in this thread. Pay close attention to relevance of answers to the topic, to what they are being asked about, and to what their claim would entail. The answers give impression of being valid, because they contain facts. But these facts are disconnected from the statements they are supposed to support. — 2023-10-14; the rest is the original post from Friday.

Quote from: harerod on October 13, 2023, 08:05:37 am

Thanks for the warning. Unless I made a typo (see screenshots), these commands led to a functioning Win7 setup, comprising three partitions on that SSD.

Possible. I answered this in the last sentence of the first paragraph.

Quote from: harerod on October 13, 2023, 08:05:37 am

I'd gladly accept any expert input on a better command syntax.

The entire second paragraph of my message. Of course it’s better to not take the risk of using data destroyer in the first place.

One of the things I forgot is, that writing outside of a file system may require invoking BLKFLSBUF ioctl on the target device (e.g. using blockdev --flushbufs). sync forces sending data to the device, but I recall there were some issues with flushing device’s internal buffers, as also mentioned by Jérôme Pouiller. Unfortunately I no longer recall the details and an attempt to refresh my memory with search engines lands me in discussions regarding filesystems, not devices.
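A sketch of that flush step, assuming util-linux blockdev is installed. The function name is invented; on anything that is not a block device it only runs sync.

```shell
#!/bin/sh
# After writing an image to a raw block device: flush the page cache, then ask
# the kernel to flush that device's buffers (the BLKFLSBUF ioctl).
flush_target() {
    target=$1
    sync
    if [ -b "$target" ]; then
        blockdev --flushbufs "$target"
    else
        echo "not a block device, skipped blockdev: $target" >&2
    fi
}

# Usage (needs root):
#   flush_target /dev/sdb
```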

Quote from: wraper on October 13, 2023, 08:15:41 am

You should take into account that while reading SSD you waste its time it could spend on housekeeping. So if you read it all and then turn off, it's worse than just leaving it powered and doing nothing for the same amount of time.

Finally an answer. However, that response suggests that the firmware is doing refreshing 100% of the time, if the SSD is unused. Is this true?

wraper · « **Reply #24 on:** October 13, 2023, 08:57:55 am »

Quote from: golden_labels on October 13, 2023, 08:31:00 am

Finally an answer. However, that response suggests that the firmware is doing refreshing 100% of the time, if the SSD is unused. Is this true?

What exactly it does, and how, only the manufacturer knows. But it's very unlikely it will do housekeeping while reading large amounts of data since it will reduce performance a lot. And I'm almost sure it won't erase cells to free them for writing other data as it's much slower than just writing. It cannot be 100% of the time as there is only so much to do. But from what I've seen, an SSD needs to be left alone to do garbage collection and stuff to restore performance after filling it with data. Likely the same with refreshing old data.

