Author Topic: How to create ELF file which contains the normal prog, plus a relocatable block?  (Read 12613 times)

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
I am using ST Cube IDE (32F4) and trying to generate a binary image which comprises my program, plus a 16k block which is linked to run from the start of main RAM.

It would be copied to RAM by a copy loop in the startup .s file - in the same way as data is initialised, bss is zeroed, etc.

A lot of people have been up this path but I wonder what is the best way to achieve this.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
I don't know your linker script.
Assuming that .data is already the first section in RAM, I'd include the .o file containing the 16k block at the beginning of the .data section (in the linker script).
As the .data section is placed in ROM and copied to RAM anyway, there should be no need for an extra copy then.

If the 16k block is not a *.o file, but a raw binary file, then it needs to be converted to a .o file in the first place, e.g. with objcopy.
[ In this case it also won't contain exported symbols which can be directly referenced from your main program. So it is then up to you how to find the entry points (functions) in the code blob. ]

See also
https://stackoverflow.com/questions/17265950/linking-arbitrary-data-using-gcc-arm-toolchain
https://stackoverflow.com/questions/327609/include-binary-file-with-gnu-ld-linker-script/328137
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
"As the .data section is placed in ROM and copied to RAM anyway, there should be no need for an extra copy then."

That's a cunning solution :)

However, I don't want to waste 16k of RAM. There is just 128k, plus 64k CCM RAM which is already reserved for stuff. The plan is to use the 16k RAM code for code which can't execute from CPU FLASH: Conditionally (having checked some CRC etc), copy a 1MB portion of an SPI FLASH chip to the CPU FLASH (IOW, program the entire CPU from the SPI FLASH).

And that 16k of RAM would then be "dumped", next time the thing boots up.

Linker script:

Code:
/* Entry Point */
ENTRY(Reset_Handler)

/* Highest address of the main stack */
 /*  _estack = 0x20020000;  */    /* stack in 128K RAM */
 _estack = 0x10010000;    /* stack in  64k CCM - note: configTOTAL_HEAP_SIZE + min_stack_size must not exceed 64k) */
 
/* top of RAM for _sbrk - top of heap check */
 _top = 0x20020000;

/* Heap and stack sizes */
_Min_Heap_Size  = 0xa000;     /* 40k heap - min size; it can grow to end of main RAM  */
_Min_Stack_Size = 0x4000;     /* 16k stack - in CCM */

/* Specify the memory areas */
/* CCMRAM added PH 12/5/2021 - cannot use with DMA */
MEMORY
{
  FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 1024K
  RAM (xrw)       : ORIGIN = 0x20000000, LENGTH = 128K
  MEMORY_B1 (rx)  : ORIGIN = 0x60000000, LENGTH = 0K
  CCMRAM (rw)     : ORIGIN = 0x10000000, LENGTH = 64K
}

/* Define output sections */
SECTIONS
{
  /* The startup code goes first into FLASH */
  .isr_vector :
  {
    . = ALIGN(4);
    KEEP(*(.isr_vector)) /* Startup code */
    . = ALIGN(4);
  } >FLASH

  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
*(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbol at end of code */
  } >FLASH


   .ARM.extab   : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
    .ARM : {
    __exidx_start = .;
      *(.ARM.exidx*)
      __exidx_end = .;
    } >FLASH

  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(.fini_array*))
    KEEP (*(SORT(.fini_array.*)))
    PROVIDE_HIDDEN (__fini_array_end = .);
  } >FLASH

  /* used by the startup to initialize data */
  _sidata = .;

  /* Initialized data sections goes into RAM, load LMA copy after code */
  .data : AT ( _sidata )
  {
    . = ALIGN(4);
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */

    . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM

  /* Uninitialized data section */
  . = ALIGN(4);
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    __bss_start__ = _sbss;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
    __bss_end__ = _ebss;
  } >RAM

  /* The heap ends up after BSS in main RAM */
  /* This also checks that the top of the heap doesn't hit the bottom of the stack i.e. how much RAM left */
  /* User_heap_stack section, used to check that there is enough RAM left */
 
  /* ._user_heap_stack : */
  .main_heap :
  {
    . = ALIGN(8);
    PROVIDE ( end = . );
    PROVIDE ( _end = . );
    . = . + _Min_Heap_Size;
 /*   . = . + _Min_Stack_Size; */  /* PH 14/5/2021 stack is in CCM, not here */
    . = ALIGN(8);
   } >RAM
  /*   } >CCMRAM */


  /* MEMORY_bank1 section, code must be located here explicitly            */
  /* Example: extern int foo(void) __attribute__ ((section (".mb1text"))); */
  /* Not used 14/5/2021 - was apparently used for LCD display on ST dev kit */
  .memory_b1_text :
  {
    *(.mb1text)        /* .mb1text sections (code) */
    *(.mb1text*)       /* .mb1text* sections (code)  */
    *(.mb1rodata)      /* read-only data (constants) */
    *(.mb1rodata*)
  } >MEMORY_B1
 
  /* CCM-RAM section
  *
  * IMPORTANT NOTE!
  * If variables placed in this section must be zero initialized,
  * the startup code needs to be modified to initialize this section. 
  * Done PH 12/5/2021
  */
  .ccmram :
  {
    . = ALIGN(4);
    _sccmram = .;       /* create a global symbol at ccmram start */
    *(.ccmram)
    *(.ccmram*)
   
    . = ALIGN(4);
    _eccmram = .;       /* create a global symbol at ccmram end */
  } >CCMRAM

  /* Remove information from the standard libraries */
  /DISCARD/ :
  {
    libc.a ( * )
    libm.a ( * )
    libgcc.a ( * )
  }

  .ARM.attributes 0 : { *(.ARM.attributes) }
}


However, I don't just want to load a chunk of binary data into RAM. That is easy. One could even have a .c file with an initialised uchar array in it; that will automatically end up in initialised data. The challenge is how to implement in effect

ORG 0x20000000
C code goes here

and have this within the same project.

It is obviously possible to generate code located to run from address 0x20000000 if one sets up a separate Cube IDE project and uses that just to generate this boot loader. But then one ends up with two projects to maintain, and switch between them.
« Last Edit: July 12, 2021, 09:19:45 pm by peter-h »
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
Quote
ORG 0x20000000
C code goes here

I think something like

Code:
SECTIONS {
  ...
  myccode 0x20000000 :
    {
      . = ALIGN(4);
      _smyccode = .;   
      myccode.o
      . = ALIGN(4);
      _emyccode = .;   
    } AT>FLASH
  ...
}

Edit: But keep in mind that your code and .data cannot reside simultaneously at 0x20000000...
Once you copy your code to 0x20000000, .data gets overwritten, so you must ensure that anything from .data is not accessed any more.
« Last Edit: July 12, 2021, 10:36:12 pm by gf »
 

Offline bugnate

  • Regular Contributor
  • *
  • Posts: 58
  • Country: us
It is obviously possible to generate code located to run from address 0x20000000 if one sets up a separate Cube IDE project and uses that just to generate this boot loader. But then one ends up with two projects to maintain, and switch between them.

I had a similar conundrum some years back for a F103 bootloader (copied itself to SRAM and executed from there so that it could overwrite its own flash space, for reasons). In the end I was not able to get a truly satisfying solution but I was able to keep it to a single project by doing the gymnastics in the Makefile (and conditionals in the Linker file) instead of a totally separate project.
« Last Edit: July 13, 2021, 02:00:32 am by bugnate »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6264
  • Country: fi
    • My home page and email address
I personally would treat the bootloader (or whatever that is that runs from RAM) as a separate "process": set up a stack, copy it to RAM, and call it.  If it ever returns, assume hardware state is still the same, and continue by resetting the stack, copy initialized RAM from Flash, clear the rest of RAM, call constructors, and finally call main(), as normal.

In other words, implement it as a duplicate environment run prior to the proper one.  No shared resources at all.  (Passing data is difficult, but possible: keep it at the middle of the stack, don't let it be cleared, and use a constructor function that calls the library malloc() function, and copies the passed data to that, assuming you don't want to keep the RAM reserved for that data until shutdown.)

I don't use ST Cube IDE, but it should not need much meddling in the CRT (assembly run prior to main()) and linker files.  I can probably point out the changes if you provide the linker file and the .s source, though.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
At one level I think the answer must be simple.

For example, to locate an array in the CCM memory, you use

Code:
#define CCMRAM __attribute__((section(".ccmram")))

// RTOS buffer, used for process stacks etc
CCMRAM uint8_t ucHeap[configTOTAL_HEAP_SIZE];

and then in the linker script (posted above) you have corresponding commands to make it work.

The same method should work for code ("text").

What you won't get, unless you do additional aerobatics, is "data" i.e. initialised data, so e.g.

int x=0;

will not work. You have to use

int x;
x=0;

This is easy to remember. The challenge will be if using libraries written by others, because loads of people do stuff like
for (int x=0; x<10; x++)

It may not work at all if a library is provided as binary. And even if provided as source (e.g. the ST libs) you have to go through it line by line. For example here you see a line right away, and there are plenty more in that HAL function (which is full of crap anyway like all the ST libs)

Code:
// Send and receive 1 byte
// Returns received byte or zero on failure
static uint8_t at45dbxx_tx_rx_byte(uint8_t data)
{
uint8_t ret = 0;
HAL_SPI_TransmitReceive(&_45DBXX_SPI, &data, &ret, 1, AT45DB_SPI_TIMEOUT_MS);
return ret;
}

If you want to avoid having to go through the code, you have to figure out how to set up a "data" section as well.

In reality this piece of loader code would be used for a specific simple task only: to write the CPU FLASH. In this case the data is read from a serial (SPI) FLASH chip. The code for that is all done and would need to be stripped down / checked / edited to make sure no initialised data section is needed. Then I need to find code to write the CPU FLASH and I am sure many have been up that path.

The two stackoverflow links above are fine if you generate the code using a separate project, but as I say that is basically simple because you could achieve it with a little program which takes in a binary block, or even an .elf file, and generates C source of the form

uint8_t buf[]  = { 0xb5, 0x62, 0x01, 0x35, 0xb5, 0x62, 0x01, 0x35, 0xb5, 0x62, 0x01, 0x35, 0xb5, 0x62, 0x01, 0x35 };

and you just drop that into your C source.
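
A minimal sketch of such a generator, assuming the blob is already in memory. `blob_to_c_array`, `put` and their parameters are hypothetical names, not part of any toolchain. It formats the bytes as a `uint8_t` array definition like the one above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Append `piece` to `out` (capacity `cap`, `used` characters so far).
   Truncates safely and returns the length the full text would have. */
static size_t put(char *out, size_t cap, size_t used, const char *piece)
{
    size_t n = strlen(piece);
    if (used < cap) {
        size_t room = cap - used - 1;          /* keep one byte for the NUL */
        size_t copy = (n < room) ? n : room;
        memcpy(out + used, piece, copy);
        out[used + copy] = '\0';
    }
    return used + n;
}

/* Hypothetical helper: write `len` bytes of `blob` as a C array definition
   called `name` into `out`. Expects len >= 1. Returns the number of
   characters the complete text needs (excluding the NUL). */
size_t blob_to_c_array(char *out, size_t cap, const char *name,
                       const uint8_t *blob, size_t len)
{
    char item[16];
    size_t used = 0;

    assert(name != NULL && blob != NULL);

    used = put(out, cap, used, "uint8_t ");
    used = put(out, cap, used, name);
    used = put(out, cap, used, "[] = {");
    for (size_t i = 0; i < len; i++) {
        snprintf(item, sizeof item, "%s0x%02x",
                 (i == 0) ? " " : ", ", (unsigned)blob[i]);
        used = put(out, cap, used, item);
    }
    used = put(out, cap, used, " };");
    return used;
}
```

On a PC the blob would typically come from a file read with fread(), and the result would be written to a .c file that is added to the project.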

Is there a directive one can use in a C sourcefile to direct initialised data into a section of another name, e.g. "data2"? If that is possible, then it is just another simple bit of asm code in the startup:

Code:
.syntax unified
  .cpu cortex-m4
  .fpu softvfp
  .thumb

.global  g_pfnVectors
.global  Default_Handler

/* start address for the initialization values of the .data section.
defined in linker script */
.word  _sidata
/* start address for the .data section. defined in linker script */ 
.word  _sdata
/* end address for the .data section. defined in linker script */
.word  _edata
/* start address for the .bss section. defined in linker script */
.word  _sbss
/* end address for the .bss section. defined in linker script */
.word  _ebss
/* stack used for SystemInit_ExtMemCtl; always internal RAM used */

/**
 * @brief  This is the code that gets called when the processor first
 *          starts execution following a reset event. Only the absolutely
 *          necessary set is performed, after which the application
 *          supplied main() routine is called.
 * @param  None
 * @retval : None
*/

    .section  .text.Reset_Handler
  .weak  Reset_Handler
  .type  Reset_Handler, %function
Reset_Handler: 

  ldr   sp, =_estack

/* Copy the data segment initializers from flash to SRAM */ 
  movs  r1, #0
  b  LoopCopyDataInit

CopyDataInit:
  ldr  r3, =_sidata
  ldr  r3, [r3, r1]
  str  r3, [r0, r1]
  adds  r1, r1, #4
   
LoopCopyDataInit:
  ldr  r0, =_sdata
  ldr  r3, =_edata
  adds  r2, r0, r1
  cmp  r2, r3
  bcc  CopyDataInit
  ldr  r2, =_sbss
  b  LoopFillZerobss
/* Zero fill the bss segment. */ 
FillZerobss:
  movs  r3, #0
  str  r3, [r2], #4
   
LoopFillZerobss:
  ldr  r3, = _ebss
  cmp  r2, r3
  bcc  FillZerobss

/* Zero CCM RAM - fills the whole CCM so don't use the stack until afterwards :) */
/* PH 15/5/2021 */
ldr r2, = 0x10000000  /* was _sccmram */
b LoopFillZeroCcm

FillZeroCcm:
movs r3, 0xaaaaaaaa /* this fill intentionally differs from the a5a5a5a5 fill used by FreeRTOS for its stacks */
  str  r3, [r2]
adds r2, r2, #4

LoopFillZeroCcm:
ldr r3, = 0x10010000  /* was _eccmram */
cmp r2, r3
bcc FillZeroCcm

/* Call the clock system initialization function.*/
  bl  SystemInit   
/* Call static constructors */
    bl __libc_init_array
/* Call the application's entry point.*/
  bl  main
  bx  lr   
.size  Reset_Handler, .-Reset_Handler
« Last Edit: July 13, 2021, 08:10:52 am by peter-h »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14476
  • Country: fr
I was going to suggest looking at the '-fpic' GCC option, which is supposed to generate relocatable code. I think there are ARM-specific options complementing -fpic.
It can apparently make the code significantly larger than without, though.

If you don't have the compiler specifically generate relocatable code, it may not work if you load it at a location different from where it was defined to be at link-time. In particular, function calls within the module that you want to be relocatable may break, except for calls to functions that are inlined or that happen to be made with relative jumps, and that's not guaranteed if you don't use '-fpic'.

A simpler approach is of course to define a dedicated section in the linker script (in the RAM area you want to run your code from), and use attributes in your code to put it in the right section.
If I got it right, you were concerned that this approach would require reserving a RAM block large enough for this, that would be otherwise not accessible for other needs if said code is not loaded or not used? Well, the way I see it, the only way you could reuse this RAM block would be through dynamic allocation anyway. So you can arrange your linker script to make this area available for heap allocation as well. That may additionally require writing your own _sbrk() function, though.
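
As a rough illustration of what a custom _sbrk() could look like, here is a sketch that hands out memory from a fixed region and refuses to grow past its end. It is named `sim_sbrk` and works on a static array so it can be tried on a PC; on the target the bounds would instead come from linker symbols (such as `end` and `_top` in the script posted above), and a real newlib _sbrk() would also set errno. All names here are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the RAM region that the heap may use. */
static uint8_t sim_ram[1024];
static uint8_t *heap_end = sim_ram;                           /* current break  */
static uint8_t *const heap_limit = sim_ram + sizeof sim_ram;  /* one past end   */

/* Move the break by `incr` bytes and return the previous break,
   or (void *)-1 if the request does not fit in the region. */
void *sim_sbrk(ptrdiff_t incr)
{
    uint8_t *prev = heap_end;
    ptrdiff_t used = heap_end - sim_ram;
    ptrdiff_t room = heap_limit - heap_end;

    assert(heap_end >= sim_ram && heap_end <= heap_limit);

    if (incr > room || incr < -used)
        return (void *)-1;

    heap_end += incr;
    return prev;
}
```

If the region handed to the heap starts at the address the RAM-resident code was linked for, that block becomes ordinary heap space once the code is no longer needed.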
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
Quote from: peter-h
What you won't get, unless you do additional aerobatics, is "data" i.e. initialised data, so e.g.
int x=0;
will not work

Why do you think that you need to handle the (initialized) .data section of your RAM program different than .text? I don't see the difference. When you activate the RAM program, you need to copy the contents of both, .text and .data, from flash to RAM. You can even combine both input sections, .text and .data (and not to forget several others), into a single output section in the linker script, and copy them as a single chunk from flash to RAM upon activation. Different is rather .bss, because you likely don't want to allocate (waste) zero-filled space for it in flash, but rather zero it at runtime.

If you don't have the compiler specifically generate relocatable code, it may not work if you load it at a location different from where it was defined to be at link-time. In particular, function calls within the module that you want to be relocatable may break, except for calls to functions that are inlined or that happen to be made with relative jumps, and that's not guaranteed if you don't use '-fpic'.

As far as I understand, it is supposed to run at a fixed address anyway. In the linker script, you can specify both the VMA (virtual memory address) and the LMA (load memory address) for an output section. The VMA is the address where the section is supposed to reside at runtime, and the LMA is the address where the section contents are placed by the linker (e.g. in flash, before they are copied to the VMA at runtime). Symbols are relocated by the linker according to the VMA.
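
A rough C sketch of that runtime step, assuming word-aligned sections. On the target the arguments would come from linker symbols (for example _smyccode and _emyccode from the earlier fragment, plus a load-address symbol defined with LOADADDR()); here the function names are made up and plain arrays can stand in for flash and RAM so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a section image word by word from its load address (LMA, in flash)
   to its run address (VMA, in RAM). `vma_end` is one past the last word. */
void copy_section(uint32_t *vma_start, const uint32_t *vma_end,
                  const uint32_t *lma_start)
{
    assert(vma_start <= vma_end);
    while (vma_start < vma_end)
        *vma_start++ = *lma_start++;
}

/* Zero a .bss-like region at runtime, so that no zero-filled image
   has to be stored in flash for it. */
void zero_section(uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end)
        *start++ = 0u;
}
```

This mirrors what the CopyDataInit and FillZerobss loops in the startup file do for the main program's .data and .bss.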
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14476
  • Country: fr
In the linker script, you can specify both the VMA (virtual memory address) and the LMA (load memory address) for an output section. The VMA is the address where the section is supposed to reside at runtime, and the LMA is the address where the section contents are placed by the linker (e.g. in flash, before they are copied to the VMA at runtime). Symbols are relocated by the linker according to the VMA.

I'll have to look into this. But as far as I understand, that will work only if the locations are fixed (which is probably good enough here for the OP.)

For fully relocatable code that you want to be able to run from anywhere in memory, you'll still need the '-fpic' option. That's probably not useful on a small embedded target though. More so if you're writing an 'OS' that can load code anywhere in RAM.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Relocatable code would be super neat.

"A simpler approach is of course to define a dedicated section in the linker script (in the RAM area you want to run your code from), and use attributes in your code to put it in the right section."

That is how CCM RAM is implemented - see my linker script above. It is easy.

For code, it may be something like

section(".text")

where "text" is the standard section name for executable code. But then one also needs a compiler/linker directive to compile the code to execute from 0x20000000.

This is used in the startup code; a slightly different format since it is assembler



"If I got it right, you were concerned that this approach would require reserving a RAM block large enough for this"

I don't think RAM usage matters, because this code will get copied (by startup) to the base of RAM, 0x20000000, it will do its job, and then will jump to the original startup code, probably Reset_Handler: above. So it doesn't matter if it uses up all 128k of RAM... in fact it may as well use as much as it can because that will speed up FLASH programming.

But the code won't be that big, because it just needs to read the serial FLASH and, subject to CRC check etc, write the CPU FLASH with it. Well, all except a block of CPU FLASH containing the code in question (to prevent bricking the unit if e.g. power was lost during programming). The code for the serial FLASH is all running (in the main project) and even being ex ST HAL bloatware it isn't more than a few k.
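
The thread does not say which CRC is used, so as an assumption the sketch below uses the common reflected CRC-32 (polynomial 0xEDB88320, as used by zlib and Ethernet), computed bit by bit so it needs no lookup table in RAM. `crc32_update` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32. Start with crc = 0; the result of one call can
   be passed to the next call to process an image in several chunks. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the bit shifted out was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

The chunked form suits reading the serial FLASH a page at a time: the loader can run the whole image through the CRC first and only start erasing the CPU FLASH if the result matches the stored value.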

One gotcha, apparently, is that the uppermost CPU FLASH is a 128k block, which needs to be erased all at once, so the loader cannot go in there (if you want a totally brick-proof box) :) Well, it can.. this is a secondary issue.
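
For reference, a sketch of the sector map this refers to, assuming the 1 MB single-bank layout given in ST's reference manual for the F405/407/415/417 parts (four 16k sectors, one 64k sector, seven 128k sectors). It should be checked against the manual for the exact device. `flash_sector_of` and `FLASH_BASE_ADDR` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u   /* start of CPU FLASH               */
#define FLASH_SIZE      0x00100000u   /* 1 MB, as in the linker script    */

/* Return the erase sector (0..11) containing `addr`, or -1 if the
   address is outside the CPU FLASH. */
int flash_sector_of(uint32_t addr)
{
    if (addr < FLASH_BASE_ADDR || addr >= FLASH_BASE_ADDR + FLASH_SIZE)
        return -1;

    uint32_t off = addr - FLASH_BASE_ADDR;
    if (off < 0x10000u)
        return (int)(off / 0x4000u);               /* sectors 0..3: 16k each */
    if (off < 0x20000u)
        return 4;                                  /* sector 4: 64k          */

    int sector = 5 + (int)((off - 0x20000u) / 0x20000u);  /* 5..11: 128k each */
    assert(sector >= 5 && sector <= 11);
    return sector;
}
```

With this layout, code stored in one of the small low sectors stays intact while the 128k sectors are erased and rewritten.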

"Why do you think that you need to handle the (initialized) .data section of your RAM program different than .text? I don't see the difference. When you activate the RAM program, you need to copy the contents of both, .text and .data, from flash to RAM. You can even combine both input sections, .text and .data (and not to forget several others), into a single output section in the linker script, and copy them as a single chunk from flash to RAM upon activation."

I probably need an actual example... From above, I reckon

section(".text2")

will place all subsequent C code into "text2" and then the linker can process that, and the startup code can copy a 16k block from that address to 0x20000000.

But how does one do that with initialised data? Normally, if you write
uint32_t i=5;
then the compiler automatically puts a 32 bit value with 0x00000005 in it into a section called "data". How does one override that in GCC, just for the current source file?

I've done a lot of googling and can't find much. There is a lot of little bits around e.g. execution from RAM prevents caching, so the code runs a bit slower according to some reports, but that's irrelevant.


« Last Edit: July 13, 2021, 07:34:52 pm by peter-h »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14476
  • Country: fr
But how does one do that with initialised data? Normally, if you write
uint32_t i=5;
then the compiler automatically puts a 32 bit value with 0x00000005 in it into a section called "data". How does one override that in GCC, just for the current source file?

What exactly would you like to do?
Put that in another section?
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
The standard startup code contains this



and that needs to be implemented for the block of code which is getting copied into RAM.

Otherwise, that stuff will end up in the FLASH and remain there, but the FLASH is going to be gradually overwritten by this loader. And in any case the FLASH is not accessible during the programming cycle (the CPU gets a few million wait states).

Maybe there is a way to combine initialised data with the code. That should work, given that the whole lot is going to be copied into RAM, and it won't need the above copy loop either. I have seen references to "RAM" compiler directives; maybe that puts initialised data into the code segment (normally "text" but here we will use a different name).
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
Maybe there is a way to combine initialised data with the code. That should work, given that the whole lot is going to be copied into RAM,

That's what I already suggested and where I said that I don't see a difference between the handling of .text and .data.

Quote
I have seen references to "RAM" compiler directives; maybe that puts initialised data into the code segment (normally "text" but here we will use a different name).

Input sections can be flexibly re-arranged, combined and placed at the desired VMA and LMA in the linker script, so I don't think that many custom annotations in the C file are necessary (if at all).
I suggest studying the ld documentation in detail, in order to understand linker scripts and the linker's capabilities.

I rather wonder: Is your "RAM program" self-contained (in the sense of not requiring linking with any external libraries)?
Since you want to overwrite flash, your program must not call any library functions which are placed by the linker somewhere in flash.
And vice versa, your program must not define and export any (library) symbols which happen to be referenced from any code residing in flash (except for the entry point).
If this is not guaranteed, a crash is inevitable.

If you cannot renounce the use of external libraries, then I guess that you may need to pre-link your "RAM program" with the required libs, and strip it, in order to obtain a self-contained .o file w/o external symbol references, before linking the latter into the image.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Yes it will be totally self contained.

I am working on it now.

Turns out that within the C code you can put a function into a particular section (e.g. "code_in_ram") by declaring it with:

__attribute__ ((long_call, section (".code_in_ram"))) void myfunction(void)
{
}

This is the same feature as I posted above, to define variables in the CCM RAM. For functions you also need the long_call attribute so that it uses a long address to call code in another section.

Still not sure what happens with initialised data. However, it's just been pointed out to me that any instance within a function is fine because the compiler inserts code to do that. It is only initialised data outside any function which needs this, and I can avoid that. I knew that but forgot :)
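
A small sketch of the distinction, assuming GCC on an ELF target. An initialiser on an automatic variable is compiled into instructions, while a file-scope initialiser (and also a static local one) needs an image that something must copy to RAM. The section attribute used earlier for CCM RAM also works on initialised variables. `.loader_data`, `LOADER_DATA`, `retry_limit`, `sum_to` and `retries_left` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) && defined(__ELF__)
#define LOADER_DATA __attribute__((section(".loader_data")))
#else
#define LOADER_DATA   /* other toolchains: fall back to the default section */
#endif

/* File scope: the value 5 is stored as data and must already be in RAM
   (copied by startup, or together with the loader block) before use. */
LOADER_DATA uint32_t retry_limit = 5;

/* Automatic variables: the compiler emits instructions that store the
   initial values, so no initialised-data image is involved. */
uint32_t sum_to(uint32_t n)
{
    uint32_t total = 0;
    for (uint32_t i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Reads the file-scope variable, so it depends on the copy having happened. */
uint32_t retries_left(uint32_t attempts_made)
{
    assert(attempts_made <= retry_limit);
    return retry_limit - attempts_made;
}
```

The linker script then decides where `.loader_data` ends up. If it is placed in the same output section as the loader code, a single copy brings both into RAM.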
« Last Edit: July 14, 2021, 07:40:58 am by peter-h »
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4037
  • Country: nz
Putting long_call there seems archaic. Why not let the compiler/linker figure it out?  You can use -mlong-calls in the compiler switches to make the long/short call decision automatic. If the flash and RAM addresses are less than 64 MB apart then it's unnecessary anyway.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Let's say I have a file, say loader.c, and in it is a collection of functions, one of which is loader().

Actually loader() is at the very end of the file (easier, since all supporting functions in it are static and w/o prototypes).

What is the simplest way to copy that code, say 16k regardless of how much smaller than 16k it actually is, to some address in RAM, and then jump to it?

I want to do it from a statement at the start of loader(). IOW, from the C source file.

Presumably it would use memcpy.

IOW, not have the flash-> RAM copying code in startupxxx.s, which is what everybody else is doing. The reason for this is that I have a load of code in main() which sets up I/O, peripherals, etc, and loader() will get called from main() before interrupts are enabled. If the loader is activated from startup, it needs to contain duplicates of all the I/O etc config, which is daft and looking for trouble later.

I can think of truly horrible ways to find the start address of the block of code, e.g. by declaring a unique string inside a function at the start of it, and searching the flash for it ;)

When loader() is finished it will never return to main(); it will reboot the CPU, and (assuming the flashing operation completed ok) it won't get run again, so it doesn't matter how much mess it makes in the RAM. The SP is set to the top of CCM in startup, which is miles away.
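
One possible shape for this, as a sketch only: let the linker provide the start, size and load address of the block (instead of searching flash for a marker), copy it with memcpy, and branch to it. `stage_image` and `thumb_entry` are hypothetical helpers, written so the logic can be checked on a PC with arrays standing in for flash and RAM. On Cortex-M a branch target built from a raw address must have bit 0 set to select Thumb state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `size` bytes of the loader image from `src` (flash) to `dst` (RAM).
   Returns 0 on success, or -1 if the image would not fit in `capacity`. */
int stage_image(void *dst, size_t capacity, const void *src, size_t size)
{
    assert(dst != NULL && src != NULL);
    if (size > capacity)
        return -1;
    memcpy(dst, src, size);
    return 0;
}

/* Build a branch target from the RAM base address and the offset of the
   entry function inside the block, with the Thumb bit (bit 0) set. */
uintptr_t thumb_entry(uintptr_t run_base, size_t entry_offset)
{
    return (run_base + entry_offset) | (uintptr_t)1u;
}
```

On the target the jump would then be something like `((void (*)(void))thumb_entry(0x20000000u, offset))();`. If the loader is linked with its run address in RAM, its symbol already carries the RAM address and the Thumb bit, so calling the function normally after the copy should be enough. Note that memcpy itself runs from flash here, which is fine as long as the copy happens before any flash erase starts.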
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
Below is how I guess that it may work.


Here is a dummy loader.c, containing code plus initialized and uninitialized data. It does not do anything useful; it is just a test of section placement.
Compile with: arm-none-eabi-gcc -c loader.c (don't optimize this dummy loader, otherwise the optimizer would eliminate the .data and .bss sections)
Eventually replace this with the actual loader.

Code:
static const char* pp = "abcdef";
static char s[] = "sadfasfasdf";
static char buffer[1024];

void loader(void)
{
    const char *p1 = pp;
    const char *p2 = s;
    const char *p3 = buffer;
}


Suggested modification for your linker script, for placing loader.o at proper VMA and LMA.

Code:
/* Entry Point */
ENTRY(Reset_Handler)

/* Reference loader to ensure that it gets linked in */
EXTERN(loader)

/* Highest address of the main stack */
 /*  _estack = 0x20020000;  */    /* stack in 128K RAM */
 _estack = 0x10010000;    /* stack in  64k CCM - note: configTOTAL_HEAP_SIZE + min_stack_size must not exceed 64k) */
 
/* top of RAM for _sbrk - top of heap check */
 _top = 0x20020000;

/* Heap and stack sizes */
_Min_Heap_Size  = 0xa000;     /* 40k heap - min size; it can grow to end of main RAM  */
_Min_Stack_Size = 0x4000;     /* 16k stack - in CCM */

/* Specify the memory areas */
/* CCMRAM added PH 12/5/2021 - cannot use with DMA */
MEMORY
{
  FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 1024K
  RAM (xrw)       : ORIGIN = 0x20000000, LENGTH = 128K
  MEMORY_B1 (rx)  : ORIGIN = 0x60000000, LENGTH = 0K
  CCMRAM (rw)     : ORIGIN = 0x10000000, LENGTH = 64K
}

/* Define output sections */
SECTIONS
{
 /* loader_bss and loader sections must come first, in order to override the wildcards in subsequent sections */
 loader_bss _loader_end : {
  _loader_bss_start = .;
  loader.o(.bss*)
  . = ALIGN(4);
  _loader_bss_end = .;
 }
 loader 0x20000000 : AT(_loader_loadaddr) {
  _loader_start = .;
  loader.o
  . = ALIGN(4);
  _loader_end = .;
 }
 _loader_size = SIZEOF(loader);

  /* The startup code goes first into FLASH */
  .isr_vector :
  {
    . = ALIGN(4);
    KEEP(*(.isr_vector)) /* Startup code */
    . = ALIGN(4);
  } >FLASH

  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
*(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbol at end of code */
  } >FLASH


   .ARM.extab   : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
    .ARM : {
    __exidx_start = .;
      *(.ARM.exidx*)
      __exidx_end = .;
    } >FLASH

  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(.fini_array*))
    KEEP (*(SORT(.fini_array.*)))
    PROVIDE_HIDDEN (__fini_array_end = .);
  } >FLASH

  /* Initialized data sections goes into RAM, load LMA copy after code */
  .data :
  {
    /* . = ALIGN(4); */
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */
    . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM AT >FLASH

  /* used by the startup to initialize data */
  _sidata = LOADADDR(.data);

 /* dummy placeholder in flash for loader section, to count flash usage */
 .loader : {
  . = . + SIZEOF(loader);
 } AT >FLASH
 _loader_loadaddr = LOADADDR(.loader);

  /* Uninitialized data section */
  . = ALIGN(4);
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    __bss_start__ = _sbss;
    *(.bss)
    *(.bss*)
    *(COMMON)
    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
    __bss_end__ = _ebss;
  } >RAM

  /* The heap ends up after BSS in main RAM */
  /* This also checks that the top of the heap doesn't hit the bottom of the stack i.e. how much RAM left */
  /* User_heap_stack section, used to check that there is enough RAM left */
 
  /* ._user_heap_stack : */
  .main_heap :
  {
    . = ALIGN(8);
    PROVIDE ( end = . );
    PROVIDE ( _end = . );
    . = . + _Min_Heap_Size;
 /*   . = . + _Min_Stack_Size; */  /* PH 14/5/2021 stack is in CCM, not here */
    . = ALIGN(8);
   } >RAM
  /*   } >CCMRAM */

  /* MEMORY_bank1 section, code must be located here explicitly            */
  /* Example: extern int foo(void) __attribute__ ((section (".mb1text"))); */
  /* Not used 14/5/2021 - was apparently used for LCD display on ST dev kit */
  .memory_b1_text :
  {
    *(.mb1text)        /* .mb1text sections (code) */
    *(.mb1text*)       /* .mb1text* sections (code)  */
    *(.mb1rodata)      /* read-only data (constants) */
    *(.mb1rodata*)
  } >MEMORY_B1
 
  /* CCM-RAM section
  *
  * IMPORTANT NOTE!
  * If variables placed in this section must be zero initialized,
  * the startup code needs to be modified to initialize this section.
  * Done PH 12/5/2021
  */
  .ccmram :
  {
    . = ALIGN(4);
    _sccmram = .;       /* create a global symbol at ccmram start */
    *(.ccmram)
    *(.ccmram*)
   
    . = ALIGN(4);
    _eccmram = .;       /* create a global symbol at ccmram end */
  } >CCMRAM

  /* Remove information from the standard libraries */
  /DISCARD/ :
  {
    libc.a ( * )
    libm.a ( * )
    libgcc.a ( * )
  }

  .ARM.attributes 0 : { *(.ARM.attributes) }
}


And that's how I think it could be invoked from your main():

Code: [Select]
#include <string.h>   /* memcpy, memset */

/* symbols defined by the linker script */
extern char _loader_start;
extern char _loader_end;
extern char _loader_loadaddr;
extern char _loader_bss_start;
extern char _loader_bss_end;
extern void loader(void);

int main(void)
{
    /* ensure that interrupts are disabled here */

    /* copy loader to ram */
    memcpy(&_loader_start, &_loader_loadaddr, (size_t)(&_loader_end - &_loader_start));

    /* clear loader's bss */
    memset(&_loader_bss_start, 0, (size_t)(&_loader_bss_end - &_loader_bss_start));

    /* keep interrupts disabled in loader(), since RAM is already overwritten now, and
       since you intend to overwrite the vectors and ISRs in flash from loader(). */
    loader();
    return 0;
}

 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14476
  • Country: fr
Still not sure what happens with initialised data. However, it's just been pointed out to me that any instance within a function is fine because the compiler inserts code to do that. It is only initialised data outside any function which needs this, and I can avoid that. I knew that but forgot :)

For initialized data, the thing to consider is this: in the linker script, the (usually) '.data' section is a particular one, as it requires two separate locations. One is the memory area where the variables themselves will be stored (usually somewhere in RAM), and the other for the initializers, which will end up somewhere else (usually in Flash memory for typical microcontrollers.)

This is defined this way:
Code: [Select]
.data :
    {
    } >RAM AT>FLASH

(assuming RAM is the memory area defined for RAM, and FLASH the one defined for Flash memory.)
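
To make the two-location idea concrete, here is a minimal sketch of the copy that startup code performs for '.data'. The helper name copy_initializers is hypothetical; on the target the three arguments would be the linker symbols _sidata, _sdata and _edata from the script, and the ST startup file does this in assembler rather than C.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: copy initializer words from their load address
   (LMA, in flash) to their run address (VMA, in RAM).
   On the target this would be called roughly as
   copy_initializers(&_sidata, &_sdata, &_edata); */
static size_t copy_initializers(const uint32_t *load_addr,
                                uint32_t *run_start,
                                const uint32_t *run_end)
{
    size_t words = 0;

    while (run_start < run_end) {
        *run_start++ = *load_addr++;   /* one 32-bit word at a time */
        words++;
    }
    return words;                      /* number of words copied */
}
```

A second section with its own pair of addresses would need a second call of the same kind, with its own start, end and load-address symbols.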

If you don't mind all your variables ending up in the same section, then you have nothing special to do.
You'll put, as you mentioned, the code in its own section (separate from '.text'), and you can leave the variable declarations as they are. The only problem with this approach is that variables for your alternate piece of code will be allocated with all the other variables, and thus will take up static space at all times. If they do not take up much space, I would just leave it like this and not bother. But if you don't want to waste RAM on those variables once the alternate code is no longer used, you can define yet another section for them, for instance a '.data2' section, arranged to start at the same location as '.data' so that the two overlay. I think this is possible in linker scripts, but I'm not sure the linker would let you create overlaid sections like this.

A cleaner approach, IMO, would be to link this alternate code separately, and then program both object codes on the target specifying the right start address for each to the programmer.

 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
I am having a bit of trouble, with the linker saying it can't find loader.o.

I wonder if there is a name conflict. The actual file is called KDE_loader.c (and it is KDE_loader.o it says it can't find) and the function within it is loader(). I am not sure what happens if you have e.g. loader.c containing loader() but I am avoiding that.

This is my linker script now

Code: [Select]

/* Entry Point */
ENTRY(Reset_Handler)

/* Reference loader to ensure that it gets linked in */
EXTERN(KDE_loader)    

/* Highest address of the main stack */
 /*  _estack = 0x20020000;  */    /* stack in 128K RAM */
 _estack = 0x10010000;    /* stack in 64k CCM (note: configTOTAL_HEAP_SIZE + min_stack_size must not exceed 64k) */
 
/* top of RAM for _sbrk - top of heap check */
 _top = 0x20020000;

/* Heap and stack sizes */
_Min_Heap_Size  = 0xa000;     /* 40k heap - min size; it can grow to end of main RAM  */
_Min_Stack_Size = 0x4000;     /* 16k stack - in CCM */

/* Specify the memory areas */
/* CCMRAM added PH 12/5/2021 - cannot use with DMA */
MEMORY
{
  FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 1024K
  RAM (xrw)       : ORIGIN = 0x20000000, LENGTH = 128K
  MEMORY_B1 (rx)  : ORIGIN = 0x60000000, LENGTH = 0K
  CCMRAM (rw)     : ORIGIN = 0x10000000, LENGTH = 64K
}

/* Define output sections */
SECTIONS
{
/* loader_bss and loader sections must come first, in order to override the wildcards in subsequent sections */
 loader_bss _loader_end : {
  _loader_bss_start = .;
  KDE_loader.o(.bss*)
  . = ALIGN(4);
  _loader_bss_end = .;
 }
 KDE_loader 0x20000000 : AT(_loader_loadaddr) {
  _loader_start = .;
  KDE_loader.o
  . = ALIGN(4);
  _loader_end = .;
 }
 _loader_size = SIZEOF(KDE_loader);

  /* The startup code goes first into FLASH */
  .isr_vector :
  {
    . = ALIGN(4);
    KEEP(*(.isr_vector)) /* Startup code */
    . = ALIGN(4);
  } >FLASH

  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbol at end of code */
  } >FLASH


   .ARM.extab   : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
    .ARM : {
    __exidx_start = .;
      *(.ARM.exidx*)
      __exidx_end = .;
    } >FLASH

  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(.fini_array*))
    KEEP (*(SORT(.fini_array.*)))
    PROVIDE_HIDDEN (__fini_array_end = .);
  } >FLASH

 
  /* Initialized data sections go into RAM, load LMA copy after code */
  .data :
  {
 /*   . = ALIGN(4); */
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */

    . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM AT >FLASH

 /* used by the startup to initialize data */
  _sidata = LOADADDR(.data);

 /* dummy placeholder in flash for loader section, to count flash usage */
 .KDE_loader : {
  . = . + SIZEOF(KDE_loader);
 } AT >FLASH
 _loader_loadaddr = LOADADDR(.KDE_loader);

  /* Uninitialized data section */
  . = ALIGN(4);
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    __bss_start__ = _sbss;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
    __bss_end__ = _ebss;
  } >RAM

  /* The heap ends up after BSS in main RAM */
  /* This also checks that the top of the heap doesn't hit the bottom of the stack i.e. how much RAM left */
  /* User_heap_stack section, used to check that there is enough RAM left */
 
  /* ._user_heap_stack : */
  .main_heap :
  {
    . = ALIGN(8);
    PROVIDE ( end = . );
    PROVIDE ( _end = . );
    . = . + _Min_Heap_Size;
 /*   . = . + _Min_Stack_Size; */  /* PH 14/5/2021 stack is in CCM, not here */
    . = ALIGN(8);
   } >RAM
  /*   } >CCMRAM */


  /* MEMORY_bank1 section, code must be located here explicitly            */
  /* Example: extern int foo(void) __attribute__ ((section (".mb1text"))); */
  /* Not used 14/5/2021 - was apparently used for LCD display on ST dev kit */
  .memory_b1_text :
  {
    *(.mb1text)        /* .mb1text sections (code) */
    *(.mb1text*)       /* .mb1text* sections (code)  */
    *(.mb1rodata)      /* read-only data (constants) */
    *(.mb1rodata*)
  } >MEMORY_B1
 
  /* CCM-RAM section
  *
  * IMPORTANT NOTE!
  * If variables placed in this section must be zero initialized,
  * the startup code needs to be modified to initialize this section. 
  * Done PH 12/5/2021
  */
  .ccmram :
  {
    . = ALIGN(4);
    _sccmram = .;       /* create a global symbol at ccmram start */
    *(.ccmram)
    *(.ccmram*)
   
    . = ALIGN(4);
    _eccmram = .;       /* create a global symbol at ccmram end */
  } >CCMRAM

  /* Remove information from the standard libraries */
  /DISCARD/ :
  {
    libc.a ( * )
    libm.a ( * )
    libgcc.a ( * )
  }

  .ARM.attributes 0 : { *(.ARM.attributes) }
}


I did check it contains the suggested mods, using file compare in notepad++.

One regular issue with Cube IDE is that it doesn't pick up some edits so one has to do a Clean Project / Build project etc.
« Last Edit: July 14, 2021, 09:12:15 pm by peter-h »
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de

I wonder if there is a name conflict. The actual file is called KDE_loader.c (and it is KDE_loader.o it says it can't find) and the function within it is loader(). I am not sure what happens if you have e.g. loader.c containing loader() but I am avoiding that.

The function name does not need to be the same as the module name.

This statement references the function name:
Code: [Select]
EXTERN(KDE_loader)

If the function name is loader() then it should be:
Code: [Select]
EXTERN(loader)

I've just added the EXTERN to be on the safe side, in order that the section does not get garbage collected when it is not referenced otherwise (depending on whether you invoke the linker with the option --gc-sections).
Eventually it is supposed to be referenced from main() anyway, when you call loader() from main().

Edit:
I suggest invoking the linker with the option -Map=outputfilename.map and checking the map file for correct assignment of the sections. Some fine tuning may be necessary, depending on the sections actually present in KDE_loader.o.

The section assignment in the linker script was a bit tricky. Obviously the order of the output sections matters for resolving the wildcards. So I had to put the loader_bss and loader sections at the beginning, although their load address is not known before .data has been placed. As a workaround I introduced the dummy placeholder section. The aim was that loader.c can still use the regular section names (.text, .data, .bss,...), so that not every function and every global variable needs to be annotated with __attribute__().
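
For comparison, this is roughly what the per-symbol annotation would look like if the loader did not use the regular section names. It is only a sketch: the section names .loader_text and .loader_data, the macro IN_SECTION and both symbols are made up for illustration, and the linker script would need matching output sections for them.

```c
/* The section attribute is applied only on ELF targets; on other object
   formats the macro expands to nothing so the sketch still compiles. */
#if defined(__ELF__)
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name)
#endif

/* Hypothetical loader function, placed in its own code section */
IN_SECTION(".loader_text") int loader_add(int a, int b)
{
    return a + b;
}

/* Hypothetical initialized loader variable, placed in its own data section */
IN_SECTION(".loader_data") int loader_counter = 5;
```

Every function and every global in the loader would need such a tag, which is the bookkeeping that selecting by object file name in the linker script avoids.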
« Last Edit: July 14, 2021, 09:43:10 pm by gf »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
That's done but still same error.

Sounds like KDE_loader.c is not getting compiled, which is bizarre.

It is referenced from main()

   // At this point, interrupts should still be disabled
   // Execute loader

   extern void loader(void);
   loader();

Just seen your edit. Will need to dig around to find where that is.
« Last Edit: July 14, 2021, 09:47:29 pm by peter-h »
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
Sorry, I don't know your IDE and don't know what you need to do to get it compiled. Or does the .o file possibly exist, but in the wrong directory?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
The .o file exists, in the same dir as all the others, so I guess it is not getting supplied to the linker. Yeah, I have no idea how this IDE works at that level, either, and google turns up loads of hits on others having the same problem. I do know the IDE is buggy - like everything from ST - in that one sometimes needs to do a Clean project / Build project / Reindex project for no obvious reason. I will dig around... In the distant past, all command line, makefiles, etc, a linker would be supplied, on its command line, with a file containing the files to link, and I guess this one is not getting included in that file. And I have added lots of new files to this project, so the basic method must be working.

arm-none-eabi-g++ -o "KDE.elf" @"objects.list"   -mcpu=cortex-m4 -T"C:\KDE\Project1\LinkerScript.ld" --specs=nosys.specs -Wl,-Map="KDE.map" -Wl,--gc-sections -static  -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -u _printf_float -u _scanf_float -Wl,--start-group -lc -lm -lstdc++ -lsupc++ -Wl,--end-group

and objects.list contains the .o file. I wonder if there is something in the linker .ld file which is causing it to be excluded? It is the only source file which is explicitly named in the linkfile script, because the code is being linked to run at a different address.

Edit: someone suggested putting a * before the filename in the linker .ld file, and it links! No idea why.



Apparently, the reference for the above is here: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_3.html

The code unfortunately bombs. It does a hard fault on the memcpy. OK, maybe it is not a great idea to load the code at 0x20000000 because that is the very base of RAM and even though I can't think of what might be there, at this early point in main(), I tried different addresses in this



but all that I have tried produce section overlap errors e.g.



The loader cannot be that big... from 0x20010000 to 0x20098f2b.

Practically any value other than 0x20000000 produces some overlap error e.g. with 0x20000100:



Interestingly looking in the .map file, it really is huuuge
                0x0000000000088f2c                _loader_size = SIZEOF (KDE_loader)

Code: [Select]
loader_bss      0x000000002008902c        0x0
                0x000000002008902c                _loader_bss_start = .
 *KDE_loader.o(.bss*)
                0x000000002008902c                . = ALIGN (0x4)
                0x000000002008902c                _loader_bss_end = .

KDE_loader      0x0000000020000100    0x88f2c load address 0x000000000803b6a0
                0x0000000020000100                _loader_start = .
 *KDE_loader.o()
 .data          0x0000000020000100       0x18 src/KDE_loader.o
                0x0000000020000100                GPIO_PORT_X
 .rodata        0x0000000020000118        0xc src/KDE_loader.o
                0x0000000020000118                GPIO_PIN_X
 .text.HAL_GPIO_WritePin
                0x0000000020000124       0x32 src/KDE_loader.o
 *fill*         0x0000000020000156        0x2
 .text.KDE_LED_On
                0x0000000020000158       0x38 src/KDE_loader.o
 .text.KDE_LED_Off
                0x0000000020000190       0x38 src/KDE_loader.o
 .text.hang_around
                0x00000000200001c8       0x26 src/KDE_loader.o
 .text.loader   0x00000000200001ee       0x1c src/KDE_loader.o
                0x00000000200001ee                loader
 .debug_info    0x000000002000020a      0x372 src/KDE_loader.o
 .debug_abbrev  0x000000002000057c      0x193 src/KDE_loader.o
 .debug_aranges
                0x000000002000070f       0x40 src/KDE_loader.o
 .debug_ranges  0x000000002000074f       0x30 src/KDE_loader.o
 .debug_macro   0x000000002000077f       0xc8 src/KDE_loader.o
 .debug_macro   0x0000000020000847      0x112 src/KDE_loader.o
 .debug_line    0x0000000020000959      0x50e src/KDE_loader.o
 .debug_str     0x0000000020000e67    0x880ae src/KDE_loader.o
                                      0x88234 (size before relaxing)
 .comment       0x0000000020088f15       0x53 src/KDE_loader.o
                                         0x54 (size before relaxing)

and these are the values when loading at 0x20000000. No errors this time, but the code crashes in memcpy() - unsurprisingly given the debug_str is 0x880ae in size. No idea where these huge items are coming from

Code: [Select]
KDE_loader      0x0000000020000000    0x88f2c load address 0x000000000803b6a0
                0x0000000020000000                _loader_start = .
 *KDE_loader.o()
 .data          0x0000000020000000       0x18 src/KDE_loader.o
                0x0000000020000000                GPIO_PORT_X
 .rodata        0x0000000020000018        0xc src/KDE_loader.o
                0x0000000020000018                GPIO_PIN_X
 .text.HAL_GPIO_WritePin
                0x0000000020000024       0x32 src/KDE_loader.o
 *fill*         0x0000000020000056        0x2
 .text.KDE_LED_On
                0x0000000020000058       0x38 src/KDE_loader.o
 .text.KDE_LED_Off
                0x0000000020000090       0x38 src/KDE_loader.o
 .text.hang_around
                0x00000000200000c8       0x26 src/KDE_loader.o
 .text.loader   0x00000000200000ee       0x1c src/KDE_loader.o
                0x00000000200000ee                loader
 .debug_info    0x000000002000010a      0x372 src/KDE_loader.o
 .debug_abbrev  0x000000002000047c      0x193 src/KDE_loader.o
 .debug_aranges
                0x000000002000060f       0x40 src/KDE_loader.o
 .debug_ranges  0x000000002000064f       0x30 src/KDE_loader.o
 .debug_macro   0x000000002000067f       0xc8 src/KDE_loader.o
 .debug_macro   0x0000000020000747      0x112 src/KDE_loader.o
 .debug_line    0x0000000020000859      0x50e src/KDE_loader.o
 .debug_str     0x0000000020000d67    0x880ae src/KDE_loader.o
                                      0x88234 (size before relaxing)
 .comment       0x0000000020088e15       0x53 src/KDE_loader.o
                                         0x54 (size before relaxing)
 .debug_frame   0x0000000020088e68       0x90 src/KDE_loader.o
 .ARM.attributes
                0x0000000020088ef8       0x34 src/KDE_loader.o
                0x0000000020088f2c                . = ALIGN (0x4)
                0x0000000020088f2c                _loader_end = .
                0x0000000000088f2c                _loader_size = SIZEOF (KDE_loader)

Of course, lots of people have been here before e.g. https://stackoverflow.com/questions/52439702/why-object-files-debug-str-so-big

but I have no idea how to sort this. I know debug data bloats the code a bit but this is huge.
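
Reading the numbers off the map above (the run with the loader at 0x20000000) gives a feel for the scale. The sketch below only does the arithmetic; the macro and function names are made up, and the addresses are copied from the map listing and the MEMORY block.

```c
#include <stdint.h>

/* Values copied from the map file (loader placed at 0x20000000) */
#define LOADER_START     0x20000000u
#define LOADER_CODE_END  (0x200000eeu + 0x1cu)  /* end of .text.loader */
#define LOADER_END       0x20088f2cu            /* _loader_end */
#define RAM_LENGTH       (128u * 1024u)         /* RAM region in MEMORY */

/* Bytes actually needed at run time: .data, .rodata and .text* */
static uint32_t runtime_bytes(void)
{
    return LOADER_CODE_END - LOADER_START;
}

/* Bytes the linker counted, debug sections included */
static uint32_t total_bytes(void)
{
    return LOADER_END - LOADER_START;
}

/* Non-zero if a block of this size fits into main RAM */
static int fits_in_ram(uint32_t bytes)
{
    return bytes <= RAM_LENGTH;
}
```

With these values the runtime part is 0x10a bytes, while the whole output section is 0x88f2c bytes, more than four times the 128K of main RAM. A memcpy of that length has to run off the end of RAM, which would be consistent with the hard fault.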

Further digging finds that debug_str is everywhere in the map file, in massive sizes, but clearly is just stored locally, for debugging purposes in the IDE. In the case of KDE_loader.c, it is getting included in my binary file. Or, more likely, its size is somehow being incorrectly added to the loader size.

Anyway, changing the memcpy size to 16k stops it crashing
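
Rather than hard-coding the length, the copy could refuse to run when the size reported by the linker is implausible. A minimal sketch, assuming the 16k budget mentioned in this thread; the helper name copy_loader_checked and the macro LOADER_MAX_BYTES are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define LOADER_MAX_BYTES (16u * 1024u)  /* the 16k budget from this thread */

/* Hypothetical helper: copy the loader image only if it fits the budget.
   Returns 0 on success, -1 if the size is zero or larger than the budget. */
static int copy_loader_checked(char *dst, const char *src, size_t size)
{
    if (size == 0 || size > LOADER_MAX_BYTES)
        return -1;          /* do not touch RAM at all */
    memcpy(dst, src, size);
    return 0;
}
```

On the target the call would be roughly copy_loader_checked(&_loader_start, &_loader_loadaddr, (size_t)(&_loader_end - &_loader_start)), with a failure path that does not jump to loader().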



but it hangs when it calls loader(). The assembler shows that the address of loader() is clearly wrong - it is still in flash (0x803a730)

« Last Edit: July 15, 2021, 05:10:35 pm by peter-h »
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1173
  • Country: de
When I tested the linker script entries, all .o files were in the current directory. I think that's the reason it worked without the * wildcard in front of the filename.

Since the debug sections are not required for running the code, they don't need to be included in the KDE_loader segment.
Instead of including all input sections of KDE_loader.o, you can be more selective:

Code: [Select]
KDE_loader 0x20000000 : AT(_loader_loadaddr) {
  _loader_start = .;
  *KDE_loader.o(.text*)
  *KDE_loader.o(.rodata*)
  *KDE_loader.o(.data*)
  *KDE_loader.o(.ARM.attributes)
  . = ALIGN(4);
  _loader_end = .;
 }

That's what I meant by "fine tuning" the linker script after inspecting the map.
Not sure if .ARM.attributes is needed, but it is not so big.

Quote
The assembler shows that the address of loader() is clearly wrong - it is still in flash (0x803a730)

The called address here is not loader, but __loader_veneer. This is obviously a long jump trampoline generated by the linker, which still resides close to the caller (i.e. flash is correct), and the code at the address __loader_veneer then does the actual jump to loader().

Can you single-step the next few instructions?

Edit:

In order to avoid the linker-generated veneer, you could try to change the extern declaration of loader() in main.c from
Code: [Select]
extern void loader();
to
Code: [Select]
extern void loader() __attribute__((long_call));

My understanding is that gcc then generates an indirect call to loader(), avoiding the need for the veneer. In the end it still should not make much difference.
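
An indirect call can also be written out by hand with a function pointer. The sketch below shows both forms; LONG_CALL, loader_fn, call_indirect and demo_loader are names invented for the example, and demo_loader only stands in for the real loader() so that the snippet can be built on a PC.

```c
/* long_call is an ARM-specific GCC attribute, so it is applied only when
   building for ARM; elsewhere the macro expands to nothing. */
#if defined(__arm__)
#define LONG_CALL __attribute__((long_call))
#else
#define LONG_CALL
#endif

/* Declaration as it might appear in main.c on the target */
extern void loader(void) LONG_CALL;

typedef void (*loader_fn)(void);

/* Hypothetical helper: call a routine through a pointer, so the branch
   target is taken from a register and has no range limit. */
static void call_indirect(loader_fn fn)
{
    fn();
}

/* Stand-in for loader(), only used to exercise call_indirect() */
static int demo_calls;
static void demo_loader(void)
{
    demo_calls++;
}
```

On the target, call_indirect(loader) would take the place of the direct loader() call. Either way the destination address comes from the linker, so the map file is still the place to confirm that loader really ends up in RAM.
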
« Last Edit: July 15, 2021, 08:35:19 pm by gf »
 

