I'm having trouble writing to flash on an STM32F767ZI. Our firmware (rusEfi open source ECU - github.com/rusefi/rusefi) writes two copies of the configuration to flash (in different pages), each with a CRC that confirms the whole copy was written. If power is lost while one copy is being written, we still have the old one. The problem is that occasionally (maybe 1 in 4 writes), some of the words in one or both of the copies do not get written and stay in their erased state (0xFFFFFFFF).
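For context, the two-copy scheme works roughly like the sketch below. This is a simplified illustration, not the actual rusEfi code: the struct layout, the 64-byte payload size, the choice of a plain bitwise CRC-32, and every name in it are assumptions made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a payload followed by a CRC over that payload. */
#define CONFIG_PAYLOAD_SIZE 64u

typedef struct {
    uint8_t  payload[CONFIG_PAYLOAD_SIZE];
    uint32_t crc;
} config_copy_t;

/* Bitwise CRC-32 (reflected form, polynomial 0xEDB88320). */
static uint32_t crc32_calc(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}

/* A copy is usable only if the stored CRC matches the payload. */
static int copy_is_valid(const config_copy_t *copy) {
    return crc32_calc(copy->payload, sizeof copy->payload) == copy->crc;
}

/* Return the first copy whose CRC checks out, or NULL if both are bad. */
static const config_copy_t *pick_valid_copy(const config_copy_t *first,
                                            const config_copy_t *second) {
    if (copy_is_valid(first)) {
        return first;
    }
    if (copy_is_valid(second)) {
        return second;
    }
    return NULL;
}
```

A copy with a few words left at 0xFFFFFFFF fails the CRC check, so the scheme catches the corruption. The catch is the case where both copies are hit, which leaves nothing to fall back on.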
Here's an example hex dump, pulled off the chip with an stlink.
The intact sections are intact, and the busted sections are busted.
Observations:
When this happens, the PGPERR (programming parallelism error - trying to write the wrong width to flash) flag gets set in the flash status register. That's a flag I'd expect to go off every time if there was in fact a width mismatch, but it only happens occasionally!
The chip is running at 3.3 V, and PSIZE is set to 2 (32-bit programming), which is the correct setting for that supply voltage according to the datasheet. Interestingly, if I set PSIZE to 0 (8-bit programming), the problem seems to go away (though I haven't tried a statistically significant number of times).
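To make the PSIZE and PGPERR references concrete, here is a host-side sketch of the bits involved. It is not our firmware code: `flash_regs_t` is a made-up stand-in for the real FLASH->CR and FLASH->SR registers, the `SKETCH_` names are hypothetical, and the bit positions (PSIZE in CR bits 9:8, PGPERR in SR bit 6) are my reading of the reference manual.

```c
#include <stdint.h>

/* Stand-in for the two flash registers involved (hypothetical type);
   on target these would be FLASH->CR and FLASH->SR. */
typedef struct {
    volatile uint32_t cr;
    volatile uint32_t sr;
} flash_regs_t;

/* Bit positions as I read them from the reference manual (assumption). */
#define SKETCH_CR_PSIZE_POS   8u
#define SKETCH_CR_PSIZE_MASK  (3u << SKETCH_CR_PSIZE_POS)
#define SKETCH_SR_PGPERR      (1u << 6)

/* Select the programming width: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit, 3 = 64-bit.
   All other CR bits are left unchanged. */
static void flash_set_psize(flash_regs_t *regs, uint32_t psize) {
    uint32_t cr = regs->cr & ~SKETCH_CR_PSIZE_MASK;
    regs->cr = cr | ((psize & 3u) << SKETCH_CR_PSIZE_POS);
}

/* Width in bytes of the bus write that matches a given PSIZE value. */
static uint32_t flash_psize_bytes(uint32_t psize) {
    return 1u << (psize & 3u);
}

/* Non-zero if the parallelism error flag is set in the status register. */
static int flash_has_pgperr(const flash_regs_t *regs) {
    return (regs->sr & SKETCH_SR_PGPERR) != 0u;
}
```

With PSIZE set to 2 the firmware should only issue 4-byte writes to flash, which is what it does as far as I can tell. That is why an intermittent PGPERR is so confusing.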
I've tried it on two different boards, and both have the same (intermittent) behavior.
edit: I have no problem reading/writing/erasing flash using an stlink. Only have problems when doing it from the firmware itself.
Whatever is going on, it's pretty intermittent. While debugging it I saw error-free writes, writes with a single word wrong (usually in the first few words), and writes where EVERY word failed to write.
A suspicion:
In the dump above, the failures come in blocks of 32 bytes, aligned on 32-byte boundaries. On the Cortex-M7, that's the size of a cache line. However, it doesn't always happen in 32-byte chunks.
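One way to check the cache-line pattern across many dumps, instead of eyeballing hex, is to count erased words per 32-byte line. This is a rough sketch with hypothetical names; it assumes the buffer starts on a 32-byte boundary, and it will miscount any configuration word that is legitimately 0xFFFFFFFF.

```c
#include <stddef.h>
#include <stdint.h>

#define ERASED_WORD    0xFFFFFFFFu
#define LINE_BYTES     32u   /* Cortex-M7 cache line size */
#define WORDS_PER_LINE (LINE_BYTES / sizeof(uint32_t))

typedef struct {
    size_t erased_words;            /* total words still at 0xFFFFFFFF */
    size_t fully_erased_lines;      /* 32-byte lines with every word erased */
    size_t partially_erased_lines;  /* lines with some, but not all, words erased */
} erase_stats_t;

/* Walk the dump one 32-byte line at a time and classify each line.
   Assumes words[0] sits on a 32-byte boundary in flash. */
static erase_stats_t scan_erased(const uint32_t *words, size_t count) {
    erase_stats_t stats = {0, 0, 0};
    for (size_t start = 0; start < count; start += WORDS_PER_LINE) {
        size_t remaining = count - start;
        size_t n = remaining < WORDS_PER_LINE ? remaining : WORDS_PER_LINE;
        size_t erased = 0;
        for (size_t i = 0; i < n; i++) {
            if (words[start + i] == ERASED_WORD) {
                erased++;
            }
        }
        stats.erased_words += erased;
        if (erased == n) {
            stats.fully_erased_lines++;
        } else if (erased > 0) {
            stats.partially_erased_lines++;
        }
    }
    return stats;
}
```

If the cache were the whole story, I'd expect `partially_erased_lines` to stay at zero. The single-word failures I've seen would show up there instead.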
Anybody have a clue what's going on here?