-
SD Card reliability in SPI mode
Posted by
betocool
on 25 Apr, 2024 08:39
-
Hey all,
This is more a general question than a request for a specific answer, but I would like to know your experiences in this matter.
I'm using an nRF52840 (u-blox) connected to an SD card over the SPI interface. The filesystem is FatFs as far as I can tell. Basically the file system and the SD interface are all set up automagically by Zephyr. Good or bad, no point arguing that, but comfortable.
This device writes at low speeds, 10 KBytes per second at most (so far) from a buffer, always trying to write out a multiple of 512 bytes at a time. It's not too busy. After each write, I call the flush() or sync() function (don't have the code in front of me) to make sure the data persists on the SD card.
Data files can be displayed and then retrieved over Bluetooth, and that also works a treat, reading at about 500 kbps.
So far, so good.
In time, after much use (days? weeks?) the SD cards I've used seem to deteriorate. I can read them on Windows, it doesn't complain, but the micro either fails to init them, or sometimes it does init them but it can't read the file listing. Yet, if it inits them it seems to write data. Sometimes it can't do anything.
Swap the SD card and everything seems to work well again for a few days / weeks.
I still need to improve on a few things. When the micro is reset, the power to the SD card is off (no 3.3 V). When a BLE connection is detected, the 3.3 V supply is turned on, the card is init'd, and ideally all works. If I don't reset the unit explicitly, the 3.3 V to the SD card stays on indefinitely. That is something I need to fix. When it's not doing its data logging, the unit basically just stays on. I'm sure I've covered most cases, and it's unlikely that files are left open after the micro's work is finished. Then again, I confess I have not tested all possible issues.
I wonder if you guys have experienced similar issues with SD cards using SPI? If so, if you were able to fix them or improve the reliability, what did you do?
Cheers,
Alberto
-
#1 Reply
Posted by
iMo
on 25 Apr, 2024 08:51
-
For proper handling of an SD card while streaming data to it, you may need to implement a FIFO. SD cards have a so-called "write latency", which may last up to 200 ms. During this time the card does its housekeeping, like wear leveling etc., and is not accessible.
That 200 ms is the maximum as per the SD card standard, afaik. The stalls usually last tens of ms, but they occur rather randomly.
It works without a FIFO as well, but then you may need a specially prepared filesystem.
The FIFO should be deep enough to hold the data that arrives, at your data rate, during the expected write-latency stall.
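As a rough illustration of that sizing rule, here is a minimal sketch of the arithmetic. The 10 KBytes/s rate comes from the original post and the 200 ms stall is the figure quoted above; the function name, the safety margin, and rounding up to whole 512-byte sectors are assumptions made for the example, not anything taken from the SD standard.

```cpp
#include <assert.h>
#include <stdint.h>

// Assumed sector size: the original post writes in multiples of 512 bytes.
static const uint32_t kSectorSize = 512;

// Hypothetical helper: bytes of FIFO needed to ride out one write-latency stall.
//   rate_bytes_per_s : incoming data rate
//   stall_ms         : worst-case time the card is inaccessible
//   margin           : safety factor (an assumption, tune for your system)
inline uint32_t fifo_depth_bytes(uint32_t rate_bytes_per_s,
                                 uint32_t stall_ms,
                                 uint32_t margin) {
    assert(margin >= 1);
    // Bytes arriving while the card is busy, rounded up to a whole byte.
    uint64_t raw =
        (static_cast<uint64_t>(rate_bytes_per_s) * stall_ms * margin + 999) / 1000;
    // Round up to a whole number of sectors.
    uint64_t sectors = (raw + kSectorSize - 1) / kSectorSize;
    return static_cast<uint32_t>(sectors * kSectorSize);
}
```

With 10240 bytes/s, a 200 ms stall and a margin of 2, this asks for 4096 bytes, i.e. eight sectors of buffering.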
-
#2 Reply
Posted by
faststoff
on 25 Apr, 2024 09:15
-
You have entered the terrible world of long-time SD-card reliability, betocool.
I have been down this path, and I will say that there are a lot of potential bumps you can hit. The behaviour you will see will likely also depend a great deal on the manufacturer of the SD-card. Remember, SD-cards contain a whole MCU and custom algorithms to manage the underlying NAND, custom to every manufacturer (this is usually referred to as the FTL, the flash translation layer). Also, depending on where you got the card from, it might very well be a knock-off.
Could this be relevant to you?
https://thecavepearlproject.org/2017/05/21/switching-off-sd-cards-for-low-power-data-logging/
As it says (and as was the case for me), removing power from the SD-card too soon after finishing the last write can cause data loss, as the SD-card might still be doing some internal shuffling. In my case, the updates to the FAT table for the file length were the last thing written to the SD-card before the product shut down, causing the file system to become corrupt.
Specifically, the article refers to a comment from Greiman (creator of SdFat):
https://github.com/greiman/SdFat/issues/21
"The standard says reliably removing power is not supported in SPI mode. It does suggest that you can remove power one second after the card goes not busy but does not guarantee this will work. You can’t depend on isBusy() to power down a card. It only means the card can accept a command. It may still be programming flash or moving data for wear-leveling. You really need the one second delay after not busy."
Maybe something to look into?
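One way to apply the "one second after not busy" advice is a small gate that the firmware polls before it cuts the 3.3 V rail. This is only a sketch: the struct and method names are hypothetical, the tick source and the busy flag have to come from your own driver, and per the quoted comment even the one-second delay is a suggestion rather than a guarantee.

```cpp
#include <stdint.h>

// Hold-off taken from the quoted comment: one second after the card goes not busy.
static const uint32_t kPowerOffHoldOffMs = 1000;

// Hypothetical power-down gate. Call update() on every poll, and only remove
// power once safe_to_power_off() returns true.
struct SdPowerGate {
    uint32_t idle_since_ms = 0;  // tick at which the card last became not busy
    bool idle = false;           // true while the card has stayed not busy

    void update(uint32_t now_ms, bool card_busy) {
        if (card_busy) {
            idle = false;             // any activity restarts the hold-off
        } else if (!idle) {
            idle = true;
            idle_since_ms = now_ms;   // remember the busy -> not busy edge
        }
    }

    bool safe_to_power_off(uint32_t now_ms) const {
        // Unsigned subtraction keeps working across tick counter wrap-around.
        return idle && (now_ms - idle_since_ms) >= kPowerOffHoldOffMs;
    }
};
```

The busy flag should also be treated as true whenever the firmware itself still has a write or sync pending, so the timer only starts after the last file operation has returned.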
-
#3 Reply
Posted by
fchk
on 25 Apr, 2024 13:52
-
betocool wrote:
"I wonder if you guys have experienced similar issues with SD cards using SPI? If so, if you were able to fix them or improve the reliability, what did you do?"
Don't use them. Or use them not so often.
Look here:
https://www.microchip.com/en-us/product/23lcv04m
This is a 512 kByte SRAM with a battery backup pin. Write your data into that memory. When it's full or half-full, turn on the card, copy the data to the card, and then turn it off.
Or better: use raw flash, like this one:
https://www.alliancememory.com/wp-content/uploads/pdf/flash/AllianceMemory_SPI_NAND_Flash_July2020_Rev1.0.pdf
There is no controller in it. It's the raw flash. You are responsible for all bad block management, wear levelling etc. But this memory is deterministic. You always know what it is doing. SD cards are black boxes. They don't even have fixed timings. They may stop working for half a second at any time if they please.
So:
Use SRAM to collect samples.
Copy large blocks from SRAM to flash.
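The collect-then-copy scheme above could look roughly like the following sketch. An in-memory array stands in for the external SRAM, and the class name, the half-full threshold check and the drain interface are all assumptions made for illustration; a real design would read and write the SRAM over SPI instead.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical staging buffer standing in for the battery-backed SRAM.
template <size_t Capacity>
class StagingBuffer {
public:
    // Append one sample. Returns false (and stores nothing) if it does not fit.
    bool push(const void* data, size_t len) {
        if (len > Capacity - used_) return false;
        memcpy(buf_ + used_, data, len);
        used_ += len;
        return true;
    }

    // True once the buffer is at least half full: time to power the card up.
    bool should_flush() const { return used_ >= Capacity / 2; }

    // Copy up to out_cap bytes out as one large block, oldest data first.
    // Returns the number of bytes copied; any remainder stays queued.
    size_t drain(uint8_t* out, size_t out_cap) {
        size_t n = used_ < out_cap ? used_ : out_cap;
        memcpy(out, buf_, n);
        memmove(buf_, buf_ + n, used_ - n);
        used_ -= n;
        return n;
    }

    size_t size() const { return used_; }

private:
    uint8_t buf_[Capacity] = {};
    size_t used_ = 0;
};
```

The logging loop would call push() for every sample and, once should_flush() is true, power the card, write the drained block in one go, and power down again.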
-
#4 Reply
Posted by
Foxxz
on 25 Apr, 2024 14:54
-
Perhaps look into F-RAM or P-RAM.
I don't think constantly calling sync() is helping the longevity of your SD card either. Maybe implement a power-off button.
-
#5 Reply
Posted by
Miyuki
on 26 Apr, 2024 10:35
-
It seems like an implementation issue.
SD cards and FAT filesystems are fragile and must be written to really carefully.
I have SD card logging in one recent project. It buffers data internally and then flushes it to the card at 5 to 10 second intervals. It works fine for many weeks.
But as it does not have an RTC or any date, just a fixed value, it will totally crash if any modification is made to the files, or even when they are simply deleted.
It uses the FatFs library, and when this happens it results in total garbage on the card.
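The buffer-and-flush-every-few-seconds pattern described in this post can be sketched as a small scheduler that the logging loop consults. The class below is hypothetical and not taken from that project; it only decides when a sync is due, and the actual flush or sync call is left to whatever filesystem API is in use.

```cpp
#include <stdint.h>

// Hypothetical flush scheduler: sync at most once per interval, and only if
// something has been written since the last sync.
class FlushTimer {
public:
    explicit FlushTimer(uint32_t interval_ms) : interval_ms_(interval_ms) {}

    // Call after every buffered write.
    void note_write() { dirty_ = true; }

    // Returns true when the caller should flush/sync now, and re-arms the timer.
    bool due(uint32_t now_ms) {
        if (!dirty_ || (now_ms - last_ms_) < interval_ms_) return false;
        dirty_ = false;
        last_ms_ = now_ms;
        return true;
    }

private:
    uint32_t interval_ms_;
    uint32_t last_ms_ = 0;
    bool dirty_ = false;
};
```

Constructed with 5000 (ms), this reduces a sync after every 512-byte write to at most one sync every five seconds, at the cost of losing up to five seconds of data if power disappears unexpectedly.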
-
#6 Reply
Posted by
DavidAlfa
on 26 Apr, 2024 20:44
-
Are you writing to the same area or adding more data to an existing file?
Most SD cards don't have wear leveling, so if you're always erasing/writing to the same sectors the SD will wear out pretty fast.
There are some embedded-friendly filesystems supporting wear leveling, like littlefs.
I don't think there's any reliability difference in SPI mode; however, in that mode CRC is disabled by default, so it might lead to corruption.
So it's always a good idea to enable it when possible with CMD59.
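For anyone driving the card at the raw SPI level, here is a sketch of how the 6-byte CMD59 frame could be built. The CRC7 polynomial (x^7 + x^3 + 1) and the frame layout are the usual SD command format; the function names are made up for this example. With Zephyr or FatFs the disk driver normally builds these frames itself, so whether and how CRC can be switched on there is something to check in the driver rather than assume.

```cpp
#include <stddef.h>
#include <stdint.h>

// CRC7 as used on SD command frames: polynomial x^7 + x^3 + 1, MSB first.
inline uint8_t sd_crc7(const uint8_t* data, size_t len) {
    uint8_t crc = 0;
    for (size_t i = 0; i < len; ++i) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = static_cast<uint8_t>(crc << 1);
            if ((d ^ crc) & 0x80) crc ^= 0x09;  // feedback = input bit XOR old bit 6
            d = static_cast<uint8_t>(d << 1);
        }
    }
    return crc & 0x7F;
}

// Build an SPI-mode command frame: 0b01 + 6-bit index, 32-bit argument
// (big-endian), then CRC7 followed by the stop bit.
inline void sd_build_cmd(uint8_t index, uint32_t arg, uint8_t frame[6]) {
    frame[0] = static_cast<uint8_t>(0x40 | (index & 0x3F));
    frame[1] = static_cast<uint8_t>(arg >> 24);
    frame[2] = static_cast<uint8_t>(arg >> 16);
    frame[3] = static_cast<uint8_t>(arg >> 8);
    frame[4] = static_cast<uint8_t>(arg);
    frame[5] = static_cast<uint8_t>((sd_crc7(frame, 5) << 1) | 0x01);
}

// CMD59 (CRC_ON_OFF): argument bit 0 = 1 turns CRC checking on.
inline void sd_build_cmd59_crc_on(uint8_t frame[6]) {
    sd_build_cmd(59, 0x00000001u, frame);
}
```

As a sanity check, sd_build_cmd(0, 0, frame) produces 40 00 00 00 00 95, the familiar CMD0 reset frame. Once CRC is on, every later command and data block needs a valid CRC, so the driver must generate and check them from then on.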
-
-
Yes. As to FAT, that's really not a great option as far as robustness goes. Sure, it's a convenient way of exchanging data, but if you don't even need that ease of exchange, or can write a small utility to read another format if needed, then using something more appropriate will also help mitigate corruption significantly. Such as LittleFS:
https://github.com/littlefs-project/littlefs
-
#8 Reply
Posted by
thm_w
on 27 Apr, 2024 00:11
-
OP: what brand and model of microsd card? Where was it purchased from?
I've used one in a similar manner and the card lasted for years. Power was always on unless the device was turned off. That sounds the same as your scenario, but it's not completely clear how often you are turning the device off.
DavidAlfa wrote:
"Are you writing to the same area or adding more data to an existing file?
Most of SD cards don't have wear leveling, so if you're always erasing/writing to the same sectors the SD will wear out pretty fast.
There're some embedded-friendly filesystems supporting wear leveling, like littlefs."
Fairly sure any recent major brand microSD card would have wear leveling implemented:
https://forums.sandisk.com/t/which-sandisk-cards-support-wear-leveling/34679/4
But it's a good idea to use something like littlefs, unless OP wants to implement a battery or supercap to guarantee safe shutdown.
-
#9 Reply
Posted by
abyrvalg
on 27 Apr, 2024 07:04
-
Also, check “industrial grade” sd cards. “Industrial” in their case is not just about wider operating temperature range, but also adds more robust internal algos and better memory types (MLC or even SLC in some smaller models).
-
#10 Reply
Posted by
betocool
on 28 Apr, 2024 02:21
-
Thanks for all your answers.
Looks like I can address a few things.
For this project, space is at a premium, so additional memory is out of the question. I might consider a NAND flash, but I've been down that road before with less than stellar results too. That also requires major hardware changes and yes, time is also at a premium!
The SD cards I'm using, or tested really, are Verbatim brand 32GB cards, the most common here in OZ. I tried the "Extreme" with a gray stripe, which seems to fare worst. I also tried the one with a golden stripe; that one performs better, but we've still had issues. I will later try out one of the Endurance series (I don't recall the name 100%, don't have the hardware with me), which are completely white. We'll see how that goes.
I will try syncing less often, which might help, and I will also wait until there has really been no data activity for a few hundred ms before turning the card off. As usual all these things take some time to implement but a much longer time to test. We'll see.
I appreciate all your comments.
Cheers,
Alberto
-
#11 Reply
Posted by
Peabody
on 28 Apr, 2024 03:30
-
What's curious to me is that you can still read the card in Windows when it no longer works with the MCU. So is the card physically damaged, or what? Does reformatting it in Windows make it work again?
If you have an Arduino, or anything that can use the SdFat library, you might try running the SDformat example in that library on a card that has failed. It will completely erase the card, then format it to FAT32. You cannot do the erase from Windows. An alternative is to format the card in a DSLR camera that lets you do a "low level" format, which will also do the full erase.
If the card works perfectly after that, then either your code or your library is most likely corrupting it, although the timing of your power cycles could also be involved.
-
#12 Reply
Posted by
betocool
on 28 Apr, 2024 06:40
-
The card is not physically damaged, and yes, on one occasion a Windows format fixed the 'reading directory' issue. I haven't tried formatting the card using FatFS, that functionality is not yet implemented in the micro. Another thing I could try to "fix" the card. Not something I'm keen on before getting the data.
Cheers,
Alberto
-
#13 Reply
Posted by
TizianoHV
on 29 Apr, 2024 07:54
-
I've used a 2GB Kentron and an 8GB Kingston micro SD card + SdFat for my datalogger (Arduino Nano Every).
I wrote into the same cards for weeks, 50 times per second (2kBytes/s), multiple times without any issue.
Maybe smaller cards are better for such applications?
...
dataFile = SD.open(filename, O_CREAT | O_APPEND | O_WRITE); // Open the file only once
...
while (...) {
    ...
    dataFile.println(dataString);
    SDbuffCount++;
    if (SDbuffCount > SDbuffDim) {
        dataFile.flush(); // Flush, up to 5 times per second
        SDbuffCount = 0;
    }
}
dataFile.close();
-
#14 Reply
Posted by
Peabody
on 29 Apr, 2024 14:44
-
It sounds like you're pretty much locked into the software package, but for future reference there's an alternative that does all the file system setup ahead of time, including creating all the files and low-level erasing their data segments, and after that you just write data to successive sectors on the card. There's no need to mess with the directory entry or the FAT, or do any sync or flush operations. Or Open or Close the file for that matter. You are simply writing into the sectors that have already been defined as part of the file. The LowLatencyLogger example in the SdFat library shows how this can be done. I would think this approach might cause the least problems if you are cycling power to the SD card.
-
#15 Reply
Posted by
coppice
on 29 Apr, 2024 15:03
-
thm_w wrote:
"Fairly sure any recent major brand microSD cards would have wear leveling implemented: https://forums.sandisk.com/t/which-sandisk-cards-support-wear-leveling/34679/4
But good idea to use something like littlefs, unless OP wants to implement a battery or supercap to guarantee safe shutdown."
I wouldn't trust that to be true. Even if the data sheet says there is wear levelling it might be some crude implementation that doesn't help much. SD cards have a horrible history of wearing out when used for heavily reused areas, like swap space, where they may only last days. I was still seeing that happen 5 or 6 years ago. Many SD cards also fail completely after weeks or months, without particularly high usage. There is some evidence that this may be associated with particular products they are plugged into. For example, one of the Samsung Galaxys, I can't remember which, had a huge problem with dying SD cards. That doesn't seem to be your problem, though, as you indicated the cards become flaky rather than die completely. If they are wearing out, and write more reliably on one machine than on another, look at the power supply. Any setup with a droopy supply is likely to show up worn areas first. A rock steady supply will ensure they get pumped to the maximum.
-
#16 Reply
Posted by
thm_w
on 29 Apr, 2024 23:40
-
thm_w wrote:
"Fairly sure any recent major brand microSD cards would have wear leveling implemented: https://forums.sandisk.com/t/which-sandisk-cards-support-wear-leveling/34679/4
But good idea to use something like littlefs, unless OP wants to implement a battery or supercap to guarantee safe shutdown."
coppice wrote:
"I wouldn't trust that to be true. Even if the data sheet says there is wear levelling it might be some crude implementation that doesn't help much."
I would 100% trust it to be true. Not having wear leveling on a major brand modern microSD would be crazy.
As mentioned in the link, if you want guaranteed reliability then you'd choose the appropriate endurance/industrial branded card. Verbatim doesn't even make a high endurance or 24/7 rated card that I can see. It's just standard low grade consumer stuff, which has no guaranteed operating life.
Relevant models: Kingston Endurance, SanDisk High/Max Endurance, Samsung PRO Endurance, Transcend High Endurance, Transcend or ADATA SLC, etc.
-
#17 Reply
Posted by
peter-h
on 30 Apr, 2024 15:01
-
I may be missing something simple, but surely talking to an SD card via the serial interface is no different to using say FatFS and talking to an SPI flash memory chip.
You will be limited by the flash endurance; typically 100k writes to the same block, plus other limitations like adjacent line interference which needs blocks to be periodically refreshed.
If you want a FAT file system (and you do if you also want it to look like a removable drive to windoze, via USB MSC device profile) then auto wear levelling in the flash media is the only way.
-
#18 Reply
Posted by
betocool
on 01 May, 2024 13:03
-
Wear leveling is something I'm not looking forward to implementing. Time as usual is of the essence, and these troubles take a long time to troubleshoot.
If I can get an SD card working with a format every few weeks, that would be a win. I will go and try some power up/down things, and add a few flags to my firmware that check whether write/read/mount are all OK. If not, the user should be able to format the SD card.
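A sketch of how those status flags might drive the recovery logic is shown below. Everything in it is hypothetical: the struct, the enum and the retry limit are illustrative names and values, and the idea of power-cycling the card a few times before offering a format is just one possible policy built from the plans mentioned above.

```cpp
#include <stdint.h>

// Results of the three checks mentioned above (mount, read, write).
struct SdCheck {
    bool mount_ok;
    bool read_ok;
    bool write_ok;
};

enum class SdAction { Continue, PowerCycleAndRetry, OfferFormat };

// Hypothetical policy: retry with a power cycle a few times, then let the
// user decide whether to format. max_retries is an assumed tuning knob.
inline SdAction next_action(const SdCheck& c, uint8_t retries_done,
                            uint8_t max_retries) {
    if (c.mount_ok && c.read_ok && c.write_ok) return SdAction::Continue;
    if (retries_done < max_retries) return SdAction::PowerCycleAndRetry;
    return SdAction::OfferFormat;
}
```

Logging which of the three checks failed, and on which attempt, would also give a first hint as to whether the trouble is in card initialisation or in the file system.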
Cheers,
Alberto
-
#19 Reply
Posted by
DavidAlfa
on 01 May, 2024 13:28
-
You still didn't answer the main question. Are you deleting and using the same file again and again?
I suggest not deleting old files: just rename them to "_old_xxx", set them as hidden, then create a new one.
Only once the SD is full or below a limit, delete all files and start over.
This will ensure storage wear is even.
-
#20 Reply
Posted by
Vojtech
on 01 May, 2024 16:08
-
Keep in mind that writing to an SD card slows down as the card fills up and when passing through certain addresses (1 G etc.). Try this test: fill the SD card to about 90-95% on the PC with files of your usual size, then insert it into your device and start writing. SD cards are quite reliable; I have used them in a similar mode (FatFs, SPI, 8051 uP) for several years in continuous operation without any problems.
-
#21 Reply
Posted by
aeg
on 03 May, 2024 08:02
-
The cards are fine. They're reading on a PC. You need to step through the initialization and file listing process and find where and why it's failing.
-
#22 Reply
Posted by
betocool
on 04 May, 2024 00:38
-
DavidAlfa, I'm creating new files with each start of logging. There's no file that gets expanded. There's an option to delete the file if required, but it's the user's choice. I tend to delete files more often; my colleagues, while testing, don't.
Cheers,
Alberto
-
#23 Reply
Posted by
coppice
on 04 May, 2024 11:15
-
aeg wrote:
"The cards are fine. They're reading on a PC. You need to step through the initialization and file listing process and find where and why it's failing."
The fact that a card reads on a certain machine does not mean it's fine. If there are cells approaching the marginal condition as they wear out, they may read and write OK on one machine and give errors on another with slightly different voltages or decoupling quality. I don't think SD cards have an ability to scan for marginal flash cells in the way, say, some MCUs can do these days. So you can't really do a performance analysis on a card, except, perhaps, by cooking the supply rail and seeing how sensitive to operating conditions it is.
-
#24 Reply
Posted by
Peabody
on 04 May, 2024 14:12
-
I think it's still very much an open question whether the cards are going bad, per your original question, or something you or your library is doing is causing the problem. My money says there's probably nothing wrong with the cards, but I don't know how to isolate what's going on. Just keep in mind that the card itself knows nothing about the file system. It just reads from, erases, and writes to sectors. Your library is responsible for all of the file system stuff. But I do wonder about the card's power supply situation. If I understand correctly, you haven't added the power cycling feature yet, so that's not the explanation for the corruption you've seen so far. In any case, there shouldn't be any sag on the card's 3.3V pin during writes, or excessive ripple at any point. Maybe I've just been lucky, but it hasn't been my experience that these cards just go bad within a couple of weeks.