Sorry; didn't mean to say it is a bug in GCC :)
It is a system built with GCC ARM 32F4 :)
The file comes from a FAT file system, originally, implemented on the target via USB. That aspect of the project had been working for 1+ year, until a few weeks ago. Recently the binary has grown in size (by replacing PolarSSL with MbedTLS) from 170k to 300k and this issue has appeared. My code should support a file up to 510k (size of the entire binary) so a size limit is not the simple answer, but I can see I may be dropping something off the end.
Disassembly:
(https://peter-ftp.co.uk/screenshots/202112013313794122.jpg)
I can see the two arrays are initialised to zero with explicit instructions (r5=0 etc). But I can't see how the MonTue... bit gets initialised.
In the linkfile, it does indeed look like rodata is loaded after all the code:
.main.o :
{
  . = ALIGN(4);
  KEEP(*(.main.o))
  *main.o (.text .text* .rodata .rodata*)
  . = ALIGN(4);
} >FLASH_APP

/* This collects all other stuff, which gets loaded into FLASH after main.o above */
.text :
{
  . = ALIGN(4);
  *(.text)           /* .text sections (code) */
  *(.text*)          /* .text* sections (code) */
  *(.rodata)         /* .rodata sections (constants, strings, etc.) */
  *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
  *(.glue_7)         /* glue arm to thumb code */
  *(.glue_7t)        /* glue thumb to arm code */
  *(.eh_frame)
  KEEP (*(.init))
  KEEP (*(.fini))
  . = ALIGN(4);
  _etext = .;        /* define a global symbol at end of code */
} >FLASH_APP
Thank you for any pointers, so to speak.
Spot the mistake :)
for (uint32_t block32k=1; block32k<16; block32k++) // 1-15
{
    // Read each 32k block into buffer1
    buffer1idx=0;
    for ( uint32_t page=pagebase; page<(pagebase+64); page++ )
    {
        AT45dbxx_ReadPage(&buffer1[buffer1idx],512,page*512);
        buffer1idx+=512;
    }

    // Program 32k block
    L_HAL_FLASH_Unlock();
    for (uint32_t i=0; i<(32*1024); i+=4)
    {
        uint32_t data=buffer1[i]|(buffer1[i+1]<<8)|(buffer1[i+2]<<16)|(buffer1[i+3]<<24);
        L_FLASH_Program_Word(i+cpubase, data);
    }
    L_HAL_FLASH_Lock();

    // Verify 32k block against buffer1
    for (uint32_t i=0; i<(32*1024); i+=4)
    {
        uint32_t data=buffer1[i]|(buffer1[i+1]<<8)|(buffer1[i+2]<<16)|(buffer1[i+3]<<24);
        if ( (*(volatile uint32_t*) (i+cpubase)) != data ) error++;
    }

    cpubase+=(32*1024);
    pagebase+=64;
    block32k++;
}
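Two lots of fossil code there. One is the block32k++ at the end of the loop body, which increments the counter a second time on top of the increment in the for header. The other is the fact that block32k is not used within the loop. With the double increment the loop only runs 8 times instead of 15, which is why everything worked for binary files up to 8 blocks (256k) or so.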
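To make the arithmetic concrete, here is a small host-side sketch that only counts loop iterations. It does not touch the AT45 or the CPU FLASH; the function names and the done counter are made up for illustration, and the loop bounds are copied from the code above.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE (32u * 1024u)   /* one 32k block, as in the loop above */

/* Loop as posted: block32k is incremented in the for header
   and again at the end of the body. */
uint32_t blocks_done_as_posted(void)
{
    uint32_t done = 0;
    for (uint32_t block32k = 1; block32k < 16; block32k++)
    {
        done++;       /* stands in for read / program / verify of one block */
        block32k++;   /* the fossil increment */
    }
    return done;
}

/* Same loop with the extra increment removed. */
uint32_t blocks_done_fixed(void)
{
    uint32_t done = 0;
    for (uint32_t block32k = 1; block32k < 16; block32k++)
    {
        done++;
    }
    return done;
}

/* Hypothetical self-check, callable from any host-side test. */
void check_block_counts(void)
{
    assert(blocks_done_as_posted() == 8);   /* 8 * 32k = 256k */
    assert(blocks_done_fixed() == 15);      /* 15 * 32k = 480k */
}
```

As posted, the counter takes the values 1, 3, 5, ... 15, so only 8 blocks (256k) get programmed and verified by this loop. Anything in the binary beyond that is never written, and the verify pass cannot flag it because it never looks at those blocks either.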