Author Topic: EEPROM corruption on AVR  (Read 4304 times)

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
EEPROM corruption on AVR
« on: March 05, 2024, 07:05:06 am »
I need to make the EEPROM in an ATtiny202 corruption proof so it can store a configuration variable.
To do that, Microchip advises enabling brown-out detection, and somewhere else I've found that it's a bad practice to use the first address in EEPROM.

My question is how corruption proof will the EEPROM be if I implement the measures above? Will it then reliably retain the data for years or even decades?
 

Online Psi

  • Super Contributor
  • ***
  • Posts: 9953
  • Country: nz
Re: EEPROM corruption on AVR
« Reply #1 on: March 05, 2024, 07:11:12 am »
It's pretty reliable as long as you don't write to the same address hundreds of thousands of times and wear it out.
According to the datasheet, data retention is 40 years at 55°C and endurance is 100,000 writes.
People have tested AVRs by writing data in loops, and it took a few million writes to one location before they saw any issues.

As long as you have the brownout detector active I'd be more worried about a coding bug causing excessive writes than I would be about the flash failing.

But you could always implement some sort of software error detection/checking where you store data in 3 different locations spread over the eeprom and then you can check them all to confirm the data agrees.  But you do have to consider the situation if you lose power after writing some of those areas but not all. Maybe a flag to say you were in the middle of a write. If you know that then you can finish the write because you know the order you wrote them so you know which one is correct.

Can also store a checksum with a block of data to confirm when you read it back all bits are correct.

I guess it boils down to how critical this data is. If it's wrong, does someone die?
If you can detect that the data has been corrupted, can the device just refuse to function?
Or does it NEED to be correct at startup every time?
« Last Edit: March 05, 2024, 07:20:46 am by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)
 

Offline kripton2035

  • Super Contributor
  • ***
  • Posts: 2587
  • Country: fr
    • kripton2035 schematics repository
Re: EEPROM corruption on AVR
« Reply #2 on: March 05, 2024, 07:17:21 am »
or you can use a small fram chip that will hold the data for centuries...
 

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #3 on: March 05, 2024, 07:21:07 am »
Excessive writes should not be an issue as the data will be written only once. I'm also pretty sure my code doesn't do any unintended writes.

Data redundancy and automatic error detection and correction is something I have thought about. But on the other hand it also looks like a possible source for coding errors.

Checksum doesn't really help anything in my application.
 

Online Psi

  • Super Contributor
  • ***
  • Posts: 9953
  • Country: nz
Re: EEPROM corruption on AVR
« Reply #4 on: March 05, 2024, 07:23:55 am »
Excessive writes should not be an issue as the data will be written only once. I'm also pretty sure my code doesn't do any unintended writes.

Until a cosmic ray flips a bit and turns the next instruction into a loop that jumps back.
Better enable the watchdog timer too :)

I contain my flash/eeprom write stuff into a function and inc a counter whenever the function gets run.
It allows me to keep an eye on this number just in case.

Also, for AVRs, in the EEPROM library there are WRITE functions and also UPDATE functions.
The Update functions are safer as they do a read first to check if the data has changed.  If you try to write the same value it skips the write.
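
The update-versus-write behaviour and the write counter can be tried on a PC with a simulated EEPROM array. This is only a sketch: `sim_eeprom`, `sim_erase`, `sim_update_byte` and `sim_write_count` are hypothetical names standing in for the real EEPROM and for avr-libc's `eeprom_update_byte()`, and the array size is chosen for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define SIM_EEPROM_SIZE 64u              /* illustrative size, not read from a device header */

static uint8_t  sim_eeprom[SIM_EEPROM_SIZE];
static uint32_t write_count;             /* how many real writes have happened */

/* Put the simulated EEPROM into the erased state (all 0xFF). */
void sim_erase(void)
{
    for (size_t i = 0; i < SIM_EEPROM_SIZE; i++) {
        sim_eeprom[i] = 0xFFu;
    }
    write_count = 0;
}

/* Write only when the stored byte differs. Returns 1 if a write was done. */
int sim_update_byte(size_t addr, uint8_t value)
{
    if (addr >= SIM_EEPROM_SIZE) {
        return 0;                        /* out of range: refuse to write */
    }
    if (sim_eeprom[addr] == value) {
        return 0;                        /* same value: skip the write */
    }
    sim_eeprom[addr] = value;
    write_count++;                       /* count every real write */
    return 1;
}

uint32_t sim_write_count(void)
{
    return write_count;
}
```

Writing the same value twice bumps the counter only once, so a counter that keeps climbing in the field points to a bug rather than to normal use.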
« Last Edit: March 05, 2024, 07:30:44 am by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)
 
The following users thanked this post: audiotubes

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #5 on: March 05, 2024, 07:39:47 am »
So actually the safest thing would be to not include any EEPROM writes in the code and program the EEPROM at the same time as the program memory.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4078
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: EEPROM corruption on AVR
« Reply #6 on: March 05, 2024, 07:46:38 am »
Corruption proof is a hard requirement.

The bare minimum should be that you can detect corruption, so add a suitable CRC or similar.
You may also need to recover from corruption: use ECC, or store multiple copies in different places in the EEPROM.
Then you should ensure your software does not write to or erase the EEPROM unless the device is truly ready and has time to do so.
You could also keep a transaction log so that you always retain the old settings when writing new ones.

Chip brownout protection shouldn't really be an issue unless there is some hardware errata.
What you should do is ensure you do not write to eeprom when there is risk. For example, if you log power-ups, wait a few seconds until you're sure the device is on, on.
Often the code is run a short while when programming or during testing and immediately cut off.
 

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #7 on: March 05, 2024, 08:02:05 am »
The setting will only be written once. It could be hard coded in the firmware but there are dozens of different values for the setting so I'm using the EEPROM to prevent having so many different versions of the firmware.
 

Offline HwAoRrDk

  • Super Contributor
  • ***
  • Posts: 1478
  • Country: gb
Re: EEPROM corruption on AVR
« Reply #8 on: March 05, 2024, 11:16:27 am »
In that case the best thing to do would be as you already mentioned: write the EEPROM only at programming time, don't include any code in your firmware that can write to the EEPROM, and add a CRC to the data which gets verified at startup. If verification fails, fall back to some hard-coded values or enter a fail-safe fault state.
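
A minimal sketch of that startup check, assuming the record is one value byte followed by a CRC-8 (polynomial 0x07, initial value 0x00). The names `config_record_t`, `config_load` and `CONFIG_FALLBACK` are hypothetical, and the record is passed in as a struct so the logic can be tested off-target; on the chip it would be read from EEPROM first.

```c
#include <stdint.h>
#include <stddef.h>

#define CONFIG_FALLBACK 0x10u   /* hypothetical hard-coded safe default */

/* Bitwise CRC-8, polynomial x^8 + x^2 + x + 1 (0x07), initial value 0x00. */
uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00u;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80u) {
                crc = (uint8_t)((crc << 1) ^ 0x07u);
            } else {
                crc = (uint8_t)(crc << 1);
            }
        }
    }
    return crc;
}

typedef struct {
    uint8_t value;   /* the configuration byte */
    uint8_t crc;     /* CRC-8 of value, written at programming time */
} config_record_t;

/* Returns 1 when the stored CRC matches the stored value. */
int config_is_valid(const config_record_t *rec)
{
    return crc8(&rec->value, 1) == rec->crc;
}

/* Returns the stored value, or the fallback if verification fails. */
uint8_t config_load(const config_record_t *rec)
{
    return config_is_valid(rec) ? rec->value : CONFIG_FALLBACK;
}
```

With these parameters an erased record (0xFF, 0xFF) fails the check, but an all-zero record would pass, so a non-zero initial value may be worth considering if zeroed memory is a realistic failure mode.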
 

Offline Kleinstein

  • Super Contributor
  • ***
  • Posts: 14209
  • Country: de
Re: EEPROM corruption on AVR
« Reply #9 on: March 05, 2024, 12:12:03 pm »
The old AVRs without brown-out detection had a problem with the first location to get overwritten. Not using the first location already helped a lot and setting the brownout detector active pretty much solved the problem.

If there is space, one could use redundant memory. Saving the data 3 times and then doing a majority vote is relatively easy, though it needs more memory.
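
A whole-byte majority vote over three copies might look like the sketch below. `vote3` is a hypothetical helper name; it reports failure when no two copies agree instead of guessing.

```c
#include <stdint.h>

/*
 * Majority vote over three stored copies of one byte.
 * Returns 1 and stores the winner in *out when at least two copies match.
 * Returns 0 and leaves *out untouched when all three copies differ.
 */
int vote3(uint8_t a, uint8_t b, uint8_t c, uint8_t *out)
{
    if (a == b || a == c) {
        *out = a;
        return 1;
    }
    if (b == c) {
        *out = b;
        return 1;
    }
    return 0;
}
```

The caller still has to decide what to do with the "no majority" result, for example fall back to a default or refuse to run.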
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 495
  • Country: sk
Re: EEPROM corruption on AVR
« Reply #10 on: March 05, 2024, 01:03:22 pm »
> Saving the data 3 times and than do a majority vote

And what if I have 3 different values, because power outage happened in the middle of writing the second copy (i.e. there's one good old value, one good new value and one corrupted new value)?

JW
 

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3240
  • Country: gb
Re: EEPROM corruption on AVR
« Reply #11 on: March 05, 2024, 02:02:17 pm »
> Saving the data 3 times and than do a majority vote

And what if I have 3 different values, because power outage happened in the middle of writing the second copy (i.e. there's one good old value, one good new value and one corrupted new value)?

JW

Include a counter in your data structure (along with a CRC) and use the valid data with the highest count.  If you expect to write often enough that the counter overflows then add some logic to detect this, e.g. if you have good data with a counter value of 255 and good data with a counter value of 0, then 0 is the most recent.
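
One way to sketch the counter-plus-CRC selection, including the 255 to 0 wrap, is shown here. The slot layout and the names `slot_t`, `slot_seal` and `pick_newest` are assumptions for illustration; the wrap logic treats a counter as newer when it is ahead by less than half the 8-bit range.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint8_t counter;  /* incremented on every save, wraps 255 -> 0 */
    uint8_t value;    /* the stored setting */
    uint8_t crc;      /* CRC-8 over counter and value */
} slot_t;

/* Bitwise CRC-8, polynomial 0x07, initial value 0x00. */
static uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00u;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ 0x07u)
                                : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

static uint8_t slot_crc(const slot_t *s)
{
    uint8_t bytes[2] = { s->counter, s->value };
    return crc8(bytes, 2);
}

/* Fill in the CRC before a slot is written. */
void slot_seal(slot_t *s)
{
    s->crc = slot_crc(s);
}

/* True when a is ahead of b by 1..127 steps, so 0 counts as newer than 255. */
static int counter_is_newer(uint8_t a, uint8_t b)
{
    uint8_t diff = (uint8_t)(a - b);
    return diff != 0u && diff < 0x80u;
}

/* Returns the index of the newest slot with a good CRC, or -1 if none. */
int pick_newest(const slot_t *slots, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (slot_crc(&slots[i]) != slots[i].crc) {
            continue;                       /* corrupted or half-written */
        }
        if (best < 0 || counter_is_newer(slots[i].counter, slots[best].counter)) {
            best = (int)i;
        }
    }
    return best;
}
```

A slot interrupted mid-write fails its CRC and is skipped, so the previous good slot wins.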
 
The following users thanked this post: horo

Online iMo

  • Super Contributor
  • ***
  • Posts: 4790
  • Country: pm
  • It's important to try new things..
Re: EEPROM corruption on AVR
« Reply #12 on: March 05, 2024, 02:07:10 pm »
The OP is not asking about protection against erroneous writes.
So he/she writes to 3 (or more) different places in the EEPROM (assuming those writes will always be OK).
Then after, say, 10 years, he/she reads the EEPROM again and does a majority vote.



« Last Edit: March 05, 2024, 02:11:34 pm by iMo »
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16865
  • Country: lv
Re: EEPROM corruption on AVR
« Reply #13 on: March 05, 2024, 02:40:54 pm »
AFAIK accidental EEPROM overwrite during power cycling was a problem only in early AVRs. Nonetheless, enabling brown-out detection ensures that the MCU does not execute incorrectly due to out-of-spec power.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 828
Re: EEPROM corruption on AVR
« Reply #14 on: March 05, 2024, 07:41:10 pm »
Quote
and somewhere else I've found that it's a bad practice to use the first address in EEPROM
Does not apply for these newer avr. The default 0 value (after reset) for the nvm.addr register is an invalid address for the nvm to use in all cases. The eeprom is at data space starting at 0x1400, and there are no addresses to avoid.
 

Offline Perkele

  • Regular Contributor
  • *
  • Posts: 50
  • Country: ie
Re: EEPROM corruption on AVR
« Reply #15 on: March 05, 2024, 08:51:11 pm »
Data mirroring with added checksums. At 16 or 20MHz, AVR is fast enough to calculate CRC-16 in a reasonable amount of time.
In addition to BOR, enable the set-up time fuse (also called power-on reset timeout on other platforms) and set it to its maximum value. It is 64 ms on most AVRs.
A PSU becoming unstable after several years in operation can cause mayhem on power-on, especially if you're doing EEPROM access at that time.
In some applications, in addition to start-up delay, I might also add a simple wait loop if I need 100ms or 200ms of delay.
 
The following users thanked this post: SL4P

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #16 on: March 06, 2024, 10:03:45 am »
So if I enable brown-out detection and set-up time and remove any writes to the EEPROM from the code the EEPROM should be quite reliable.

In addition I could write the data in multiple locations, compare those when reading and select the majority. This could also be used to correct the corrupted entries, but that would introduce write functionality into the code. So would that be bad after all, if I actually want to remove any writes from the code?

If EEPROM corruption happens will the resulting data be random or are some changes more likely than others? When I discovered the corruption problem the data had turned into FF.
If it turns out the data usually turns to FF, I should make the code ignore those entries when comparing the data it reads back.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4078
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: EEPROM corruption on AVR
« Reply #17 on: March 06, 2024, 01:03:04 pm »
Some protocols, such as J1939, try not to use FF (and sometimes 00) as valid values. The reason is that 00 and FF are erased or undefined states of memory or a bus, and accepting those increases the risk of bugs.
They often use an offset or range where 00 or FF would be invalid.

They also use special values, eg effective range is 0-250, while FB (251) to FF (255) have special meaning, such as Sensor Error (FE) and Unavailable (FF).
You could utilize a similar strategy.
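
Applied to a single stored byte, that strategy could be sketched as follows. The ranges come from the description above (0-250 valid, FE for sensor error, FF for unavailable); the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    VAL_OK,           /* 0..250: usable setting */
    VAL_RESERVED,     /* 251..253: special meaning, not a setting */
    VAL_ERROR,        /* 0xFE: error indicator */
    VAL_UNAVAILABLE   /* 0xFF: also what erased EEPROM reads as */
} val_status_t;

/* Classify a raw stored byte so erased memory is never taken as a valid setting. */
val_status_t classify_raw(uint8_t raw)
{
    if (raw <= 250u) {
        return VAL_OK;
    }
    if (raw == 0xFFu) {
        return VAL_UNAVAILABLE;
    }
    if (raw == 0xFEu) {
        return VAL_ERROR;
    }
    return VAL_RESERVED;
}
```

If 00 should be rejected as well, the valid range could be shifted to start at 1 by storing the setting with an offset.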
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 828
Re: EEPROM corruption on AVR
« Reply #18 on: March 06, 2024, 04:38:29 pm »
Quote
and remove any writes to the EEPROM from the code
Maybe describe in more detail what kind of data you want in eeprom, and how often this data needs to change. With writing removed from code as you state, it sounds like this is a one time config data write. If so, then the conversation takes a different direction.

Also note, these avr have an additional page of eeprom called user row (an additional 32 bytes in your tiny202) which does not erase with a chip erase (normal programming) or an eeprom erase (nvm command).
 

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #19 on: March 06, 2024, 07:46:08 pm »
Quote
and remove any writes to the EEPROM from the code
Maybe describe in more detail what kind of data you want in eeprom, and how often this data needs to change. With writing removed from code as you state, it sounds like this is a one time config data write. If so, then the conversation takes a different direction.

Also note, these avr have an additional page of eeprom called user row (an additional 32 bytes in your tiny202) which does not erase with a chip erase (normal programming) or an eeprom erase (nvm command).
I've already explained it before. The data will be one 8bit integer which will be programmed at the same time with the firmware and it must never change. It could be hard coded but there are dozens of different values for the setting so I'm using the EEPROM to prevent having so many different versions of the firmware.

The current solution is to send a command over serial bus to program the EEPROM once the firmware has been programmed. I could remove the code needed for that and use the programmer to program the EEPROM too.

I could use the user row but does it have other advantages over EEPROM than not being erased like you said?
« Last Edit: March 06, 2024, 07:48:19 pm by Oaklander »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8173
  • Country: fi
Re: EEPROM corruption on AVR
« Reply #20 on: March 06, 2024, 07:52:43 pm »
Some protocols, such as J1939, try not to use FF (and sometimes 00) as valid values. The reason is that 00 and FF are erased or undefined states of memory or a bus, and accepting those increases the risk of bugs.

Instead of trying to avoid some magical values which correlate with some types of problems, but miss all other sorts of corruption, one should just use checksums; CRC8 is pretty strong already for tens of bytes; CRC16 even better. If system availability is important, writing the whole checksummed block of data multiple times allows the code to try the next block if the CRC check of the first one fails.

And CRC8 is just a few lines of code, definitely simpler and more robust than some probably variable length escaping scheme.
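
As a rough illustration of the "try the next block" idea, the sketch below walks an array of checksummed copies and returns the first one whose CRC-8 matches. The block layout, the 4-byte payload and the names `block_t` and `first_valid_block` are assumptions; the CRC uses polynomial 0x07 with a zero initial value.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_PAYLOAD_LEN 4u   /* illustrative payload size */

typedef struct {
    uint8_t payload[BLOCK_PAYLOAD_LEN];
    uint8_t crc;               /* CRC-8 over payload */
} block_t;

/* Bitwise CRC-8, polynomial 0x07, initial value 0x00. */
uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00u;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ 0x07u)
                                : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* Returns the first copy whose CRC checks out, or NULL if every copy is bad. */
const block_t *first_valid_block(const block_t *copies, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (crc8(copies[i].payload, BLOCK_PAYLOAD_LEN) == copies[i].crc) {
            return &copies[i];
        }
    }
    return NULL;
}
```

No voting or comparison between copies is needed: each copy proves or disproves itself, and the read path contains no EEPROM writes.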
 

Offline AndyBeez

  • Frequent Contributor
  • **
  • Posts: 856
  • Country: nu
Re: EEPROM corruption on AVR
« Reply #21 on: March 06, 2024, 08:46:20 pm »
Will it then reliably retain the data for years or even decades?
Will the capacitors, solder joints and other mechanical contacts last for decades? Will the user trash the circuit after five years? In the wild, AVR EEPROM data should exist far beyond the lifespan of a product. I built an ATTINY to run a vehicle's console lighting; the EEPROM data is still in there, almost two decades after it was flashed. The vehicle was turned into soup cans ages ago.

Yes, setting Brownout is good practice; remembering that Brownout detection is meant to initiate a graceful restart, deep sleep mode or low battery operation. When a brownout condition exists, nothing should be written and values read may not be reliable.

What value are you writing, how often are you updating it, and how often are you reading it during boot and at runtime? If your write budget goes over the 100K limit in a couple of years, you'll need a different methodology. If you have to wait for the year 3000, then wear levelling will never be your problem.

The guys suggesting CRC are correct. You can also use shadow data where you read from multiple bytes and XOR the input with the previous input. If the result is not zero, you got bad data. You do not need too many clock cycles to implement this check in assembler.
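
The shadow-data check can be sketched in C as well as in assembler. Here `shadows_agree` is a hypothetical helper that XORs each copy with the previous one and accumulates any difference; the copies are passed in as an array so the logic can be tested off-target.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Compare n shadow copies of one byte.
 * Each copy is XORed with the previous one; any non-zero result means
 * at least two copies differ. Returns 1 when all copies are identical.
 */
int shadows_agree(const uint8_t *copies, size_t n)
{
    uint8_t diff = 0u;
    for (size_t i = 1; i < n; i++) {
        diff |= (uint8_t)(copies[i] ^ copies[i - 1]);
    }
    return diff == 0u;
}
```

This only detects disagreement; it cannot tell which copy is the bad one, which is where a CRC per copy helps.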

What you then do if the data is corrupted  :-//
 

Offline OaklanderTopic starter

  • Regular Contributor
  • *
  • Posts: 56
  • Country: 00
Re: EEPROM corruption on AVR
« Reply #22 on: March 06, 2024, 09:29:14 pm »
writing the whole checksummed block of data multiple times allows the code to try the next block if the CRC check of the first one fails
That's what I will do. It's much simpler than the majority method discussed earlier.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 828
Re: EEPROM corruption on AVR
« Reply #23 on: March 07, 2024, 03:16:54 am »
Quote
The data will be one 8bit integer which will be programmed at the same time with the firmware and it must never change
If your programmer could do the job, then just let it. Your programmer will have verified the value at programming time so no need to add any special code to get this value in the mcu, and no need to verify its value on your own. Now all you need to do is decide where you want this value- eeprom, flash, user row.

In all cases, you just have to decide what address this byte will be at- if in flash, you can use the last byte in flash (known location that will not change), for eeprom or userrow it could be the first address of either (0x1400 or 0x1300).

There are various ways to 'insert' this value into the hex file- can do it 'manually' by inserting a hex record or the programmer may have a way to add unique data at time of programming.

example for flash, storing the byte at end of flash (where, unless you are using every available byte for code, will be free)-
#define MY_SPECIAL_VALUE (*(uint8_t*)(MAPPED_PROGMEM_END)) //note mapped address used, so can read flash directly

and any use will read
if( MY_SPECIAL_VALUE > 10 ) { /* do something */ }

and this special value can be easily added to a hex file, which then will program that value at the end of flash.
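
For the manual route, a one-byte Intel HEX data record can be generated with a few lines of host-side C. This is a sketch under assumptions: `ihex_byte_record` is a hypothetical helper, and 0x07FF is used as the last flash byte on the assumption of a 2 KB flash image that starts at address 0 in the hex file.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format an Intel HEX data record (type 00) holding a single byte.
 * Layout: ':' byte-count address(16 bit) record-type data checksum.
 * The checksum is the two's complement of the sum of all preceding bytes.
 * Returns the number of characters snprintf would have written.
 */
int ihex_byte_record(uint16_t addr, uint8_t value, char *out, size_t out_size)
{
    uint8_t sum = (uint8_t)(0x01u + (addr >> 8) + (addr & 0xFFu) + 0x00u + value);
    uint8_t checksum = (uint8_t)(0u - sum);

    return snprintf(out, out_size, ":01%04X00%02X%02X",
                    (unsigned)addr, (unsigned)value, (unsigned)checksum);
}
```

For address 0x07FF and value 0x2A this yields `:0107FF002ACF`. The line has to go before the end-of-file record (`:00000001FF`) of the firmware hex file.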
 

Online Psi

  • Super Contributor
  • ***
  • Posts: 9953
  • Country: nz
Re: EEPROM corruption on AVR
« Reply #24 on: March 07, 2024, 09:54:31 am »
Just thinking outside the box a bit. Might not be useful..

If you have a page of flash memory spare/unused, you could store your EEPROM write functions in that flash page at a fixed address. Then you can use those functions to write anything you want to EEPROM using serial commands. Then, when you have the system all set up, you issue the command to do a flash erase on just that page to permanently remove all EEPROM write functions from the firmware on the device.

You'd also want to set a flag in EEPROM before page erase that says the functions no longer exist at that address. Just so you can block trying to run functions that no longer exist, but that is easy.
« Last Edit: March 07, 2024, 10:04:25 am by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)
 

