Author Topic: A question on GCC .ld linker script syntax  (Read 13447 times)


Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
A question on GCC .ld linker script syntax
« on: March 12, 2023, 06:55:15 am »
This is to do with initialising DATA sections. This stuff is stored in FLASH and copied to RAM at startup.

This is the bit of the linkerscript

Code: [Select]

/* Initialized data sections go into RAM, load LMA copy after code */
/* This stuff is copied from FLASH to RAM by the startupxxx.s code */
/* This actually collects *all* "data" sections for the whole product, which is not ideal */
/* The "data" sections from the non-boot-block code should be stored outside the boot block */
/* This wastes about 3k in the boot block */
  .boot_data :
  {
    . = ALIGN(4);
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */
      . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_BOOT

  /* used by the startup .s code to initialize data */
  _sidata = LOADADDR(.boot_data);

and this is the startup.s code which sets it up in RAM

Code: [Select]

/* Copy the data segment initializers from flash to RAM */ 
/* This actually sets up only boot block DATA but that happens to contain all DATA sections
   for the whole unit. These come to only about 3k, which is OK because there is plenty of space
   in the 32k boot block but is not ideal.
 */

  movs  r1, #0
  b  LoopCopyDataInit

CopyDataInit:
  ldr  r3, =_sidata
  ldr  r3, [r3, r1]
  str  r3, [r0, r1]
  adds  r1, r1, #4
LoopCopyDataInit:
  ldr  r0, =_sdata
  ldr  r3, =_edata
  adds  r2, r0, r1
  cmp  r2, r3
  bcc  CopyDataInit

I found that the label "boot_data" performs no function. It is not referenced anywhere. Originally it was "data" and I think that was just an ambiguity. The only labels from the above linkerscript are sdata edata and sidata, and I understand this.

The above collects DATA sections (variables initialised to nonzero; not text strings which seem to go elsewhere) from the entire project. This is ok because the DATA from the whole project currently comes to just 3k, but it could be way bigger in the future.

I am after two things:

1) How can I construct the above linker script so it collects DATA only from specific .o files? I need to have two of the above linkerscripts. One which collects DATA from specific .o files (4 of them, and that DATA will be set up in RAM by the above asm code) and the other which collects DATA from all the rest of the project (and that DATA will be set up in RAM by C code in 2) below)

2) C code for performing the same function as the asm code above. I have the following in another part of the project and am probably near to working this out.

Code: [Select]

extern char _loader_start;
extern char _loader_end;
extern char _loader_loadaddr;
extern char _loader_bss_start;
extern char _loader_bss_end;

// Copy loader code to RAM. This also sets up its initialised data.
B_memcpy(&_loader_start, &_loader_loadaddr, &_loader_end - &_loader_start);

// Clear loader's BSS and COMMON
B_memset(&_loader_bss_start, 0, &_loader_bss_end - &_loader_bss_start);

// See comments in loader.c for why the long call.

extern void loader_entry() __attribute__((long_call));
loader_entry();

// never get here (loader always reboots)
for (;;);
}

The above copies over not just DATA but code and data. The corresponding linker script is

Code: [Select]
 
  /* loader.o - goes ===LAST=== in the boot block, because after the block below
     the location pointer ends up pointing into RAM_L. You can check the boot block code
     usage by checking _loader_end_in_flash, and _loader_size, in the .map file */
     
  . = ALIGN(4);
  _loader_loadaddr = .; /* this is where the loader start is in FLASH */
 
  .loader.o :
  {
    . = ALIGN(4);
    _loader_start = .; /* this is where the loader start is in RAM */
    KEEP(*(.loader.o))
    *loader.o (.text .text*)
      . = ALIGN(4);
  } >RAM_L AT>FLASH_BOOT
 
  /* These variables are not necessarily used but can be inspected in the .map file */
  _loader_size = _loader_end - _loader_start;
  _loader_end_in_flash = _loader_loadaddr + _loader_size;

/* Place loader DATA next (right after the loader code). This is data initialised to nonzero.
   Previously we had
   *loader.o (.text .text* .rodata .rodata* .data .data*)
   in the above, but that placed DATA *before* the loader code, which is dumb, even
   though it would not really matter because the loader is entered via a named function */

  _loader_rodata_data_start :
  {
    . = ALIGN(4);
    *loader.o(.rodata .rodata* .data .data*)
    . = ALIGN(4);
    _loader_end = .; /* this is where the loader end is in RAM */
  } >RAM_L AT>FLASH_BOOT

  /* Place loader BSS & COMMON right after the loader code+DATA, in RAM. This is zeroed in b_main.c */

  _loader_bss_start :
  {
    . = ALIGN(4);
    _loader_bss_start = .;
    *loader.o(.bss* .common COMMON*)
    . = ALIGN(4);
    _loader_bss_end = .;
  } >RAM_L

I have a feeling that the answer to 1) may be to prefix the script with the name(s) of the .o file(s) but I am not sure - e.g. the ".loader.o :" below

Code: [Select]
  .loader.o :
  {
    . = ALIGN(4);
    _loader_start = .; /* this is where the loader start is in RAM */
    KEEP(*(.loader.o))
    *loader.o (.text .text*)
      . = ALIGN(4);
  } >RAM_L AT>FLASH_BOOT

And here is another bit where I think the label ".text :" is just ambiguous; you could call it "fred"...

Code: [Select]
/* This collects all other stuff, which gets loaded into FLASH after main.o above */
 
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbol at end of code */
} >FLASH_APP

Thank you very much for any help.
« Last Edit: March 12, 2023, 07:03:57 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #1 on: March 12, 2023, 09:26:52 am »
Quote
1) How can I construct the above linker script so it collects DATA only from specific .o files? I need to have two of the above linkerscripts. One which collects DATA from specific .o files (4 of them, and that DATA will be set up in RAM by the above asm code) and the other which collects DATA from all the rest of the project (and that DATA will be set up in RAM by C code in 2) below)
Let's say the four files are named boot-*.o:

You replace the *(.data) and *(.data*) in the first one with
    boot-*.o (.data .data*)
and in the second one with
    EXCLUDE_FILE (boot-*.o) *(.data .data*)

Or, if you want to keep the data in section order, first one with
    boot-*.o (.data)
    boot-*.o (.data*)
and the second one with
    EXCLUDE_FILE (boot-*.o) *(.data)
    EXCLUDE_FILE (boot-*.o) *(.data*)

Note that boot-*.o are still subject to gc.  If you want to ensure they are included and not subject to gc, replace the first one with
   KEEP(boot-*.o) boot-*.o (.data .data*)
or
   KEEP(boot-*.o) boot-*.o (.data)
   KEEP(boot-*.o) boot-*.o (.data*)

Quote
2) C code for performing the same function as the asm code above
You know the assembly is more than a bit silly?  Why not
Code: [Select]
    ldr   r1, =_sram  /* the start address of RAM, actually */
    ldr   r2, =_sdata
    ldr   r3, =_edata
    subs  r3, r3, r2   /* number of bytes to copy */
    ble  .done
    subs  r3, #4
    blt  .done
.loop:
    ldr  r0, [ r2, r3 ]
    str  r0, [ r1, r3 ]
    subs  r3, #4
    bge  .loop
.done:
This exploits the fact that after subtraction, condition code flags are set.  r3 is the decreasing index, r0 the scratch register, r1 the destination (start of RAM), and r2 the source.

Also, you set up a .boot_data section, with _sdata having the address of the contents of that section, so _sidata = LOADADDR(.boot_data) == _sdata.

The straightforward C version of above would be
Code: [Select]
extern char        _sram;
extern const char  _sdata;
extern const char  _edata;

unsigned int *const        dst = (unsigned int *)&_sram;
const unsigned int *const  src = (const unsigned int *)&_sdata;
/* Number of whole 32-bit words to copy; dst and src are word pointers,
   so they must be indexed in words, not bytes. */
unsigned int  len = (unsigned int)(&_edata - &_sdata) / sizeof (unsigned int);
while (len > 0) {
    len--;
    dst[len] = src[len];
}

Note: I didn't check any of this, so expect typos.
 
The following users thanked this post: peter-h

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #2 on: March 12, 2023, 09:46:06 am »
That's neat - thank you. Also it's nice to avoid library calls so early in the code, although in this case it is OK. I will change it to yours though.

Incidentally, would the following work?

Code: [Select]

/* Initialized data sections for boot block. These go into RAM, load LMA copy after code */
/* This stuff is copied from FLASH to RAM by the startupxxx.s code */
.boot_data :
  {
    . = ALIGN(4);
    _s_boot_data = .;        /* create a global symbol at data start */
    *b_mod1.o (.data .data*)      /* .data sections */
    *b_mod2.o (.data .data*)        /* .data sections */
    *b_mod3.o (.data .data*) /* .data sections */
    *b_mod4.o (.data .data*)        /* .data sections */
      . = ALIGN(4);
    _e_boot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_BOOT
  /* used by the startup .s code to initialize data */
  _si_boot_data = LOADADDR(.boot_data);
 
 
/* Initialized data sections for rest of unit. These go into RAM, load LMA copy after code */
/* This stuff is copied from FLASH to RAM by C code in the main stub */
.all_nonboot_data :
  {
    . = ALIGN(4);
    _s_nonboot_data = .;        /* create a global symbol at data start */
    *(.data .data*)      /* .data sections */
      . = ALIGN(4);
    _e_nonboot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_APP
  /* used by the main.c code to initialize data */
  _si_nonboot_data = LOADADDR(.all_nonboot_data);

IOW, all DATA sections not collected by the first lump (the 4 named .o files) get collected by the *(.data .data*) in the second lump.

I realise the asm code is crap, it was pointed out previously, but it came from ST and I never got around to changing it.

For the "2nd lump" I have

Code: [Select]
// Initialise DATA

extern char _s_nonboot_data;
extern char _e_nonboot_data;
extern char _si_nonboot_data;
void * memcpy (void *__restrict, const void *__restrict, size_t);  // avoid #including string.h
memcpy(&_s_nonboot_data, &_si_nonboot_data, &_e_nonboot_data - &_s_nonboot_data);

BTW the reason to avoid including any .h files in this .c module is because it is linked to a specific address and sometimes one finds .h files generate code (macros etc)!

I also wonder why use data and data* and common COMMON common* but not COMMON*. Is there some intellectual work behind this or was somebody recycling some 1978 unix code? I see no data* or data1 or data2 or common1 etc.
« Last Edit: March 13, 2023, 05:23:49 pm by peter-h »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #3 on: March 12, 2023, 11:18:37 am »
Oopsie!

I didn't notice the AT in the output section description, so I actually recommend the following:
Code: [Select]
.boot_data : ALIGN(4) {
    _sram_boot_data_start = .;
    *b_mod?.o (.data .data*);
    . = ALIGN(4);
    _sram_boot_data_stop = .;
} >RAM AT >FLASH_BOOT
_flash_boot_data_start = LOADADDR(.boot_data);
The difference being that the symbols provided by the linker are virtual addresses, but LOADADDR() yields the load memory address.

Thus,
Code: [Select]
extern char _sram_boot_data_start, _sram_boot_data_stop;
extern const char _flash_boot_data_start;

memcpy32((unsigned int *)&_sram_boot_data_start,
         (const unsigned int *)&_flash_boot_data_start,
         (unsigned int)(&_sram_boot_data_stop - &_sram_boot_data_start) / 4);
where memcpy32(unsigned int *dst, const unsigned int *src, unsigned int n) copies n 32-bit words from src to dst, with both pointers guaranteed to be aligned (by the linker script above), and non-overlapping.

Quote
Incidentally, would the following work?
.boot_data would contain only the .data sections from the four files, yes.  You could also use just
    *b_mod?.o (.data .data*)
or
    *b_mod[0-9].o (.data .data*)
That is still subject to gc by the linker, so do also consider KEEP (*b_mod?.o) *b_mod?.o (.data .data*) to ensure all code in them gets included.

However, .all_nonboot_data would contain a copy of the above.  To avoid that, you could use something like
Code: [Select]
.nonboot_data : ALIGN(4) {
    _sram_nonboot_data_start = .;
    EXCLUDE_FILE(*b_mod?.o) *(.data .data*);
    . = ALIGN(4);
    _sram_nonboot_data_stop = .;
} >RAM AT >FLASH_APP
_flash_nonboot_data_start = LOADADDR(.nonboot_data);
Edited: as peter-h points out in a later message, in the same linker script those sections have already been "consumed", so the EXCLUDE_FILE(*b_mod?.o) is not needed.
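The "already consumed" behaviour can be tried out on a host PC. This is only a sketch and not how ld is implemented: it assumes POSIX fnmatch() as a stand-in for ld's file-name wildcard matching, and the rule table, assign_output(), lands_in() and the file names are all made up for illustration.

```c
#include <assert.h>
#include <fnmatch.h>   /* POSIX glob-style matching, used here as a stand-in */
#include <stddef.h>
#include <string.h>

/* One input-section rule: file pattern + section pattern -> output section. */
struct rule {
    const char *file_pat;
    const char *sect_pat;
    const char *output;
};

/* Hypothetical rules, in the order they would appear in the script. */
static const struct rule rules[] = {
    { "*b_mod?.o", ".data*", ".boot_data"    },
    { "*",         ".data*", ".nonboot_data" },
};

/* Return the output section for an input section: the FIRST rule that
   matches wins, so later rules never see it again. NULL if none match. */
const char *assign_output(const char *file, const char *section)
{
    assert(file != NULL && section != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (fnmatch(rules[i].file_pat, file, 0) == 0 &&
            fnmatch(rules[i].sect_pat, section, 0) == 0)
            return rules[i].output;
    }
    return NULL;
}

/* Convenience check: 1 if (file, section) lands in 'output', else 0. */
int lands_in(const char *file, const char *section, const char *output)
{
    const char *got = assign_output(file, section);
    return got != NULL && strcmp(got, output) == 0;
}
```

With these rules, .data from build/b_mod1.o is claimed by the first rule and never reaches the catch-all second rule, which is why the EXCLUDE_FILE is redundant when both output sections live in the same script.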

Quote
I realise the asm code is crap, it was pointed out previously, but it came from ST and I never got around to changing it.
No worries.  It's only going to be run once, so it's nothing to worry about.

Quote
For the "2nd lump" I have
Hmm.  Adapting to my above suggestion, memcpy32(&_sram_nonboot_data_start, &_flash_nonboot_data_start, (unsigned int)(&_sram_nonboot_data_stop - &_sram_nonboot_data_start)/4) should work, which basically matches the memcpy().

Just to check (ld basic concepts), virtual memory addresses (obtained by referencing . or using ADDR(section)) are what the code sees at run time, i.e. RAM addresses; and load memory addresses are the output section addresses as specified by AT, i.e. FLASH addresses.
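To make the VMA/LMA distinction concrete, here is a host-side sketch of what the startup code does with those addresses. The arrays flash_image and ram are stand-ins I made up for the two memories; on the target the RAM range would come from the start/stop symbols and the source from the LOADADDR() symbol. The function names are hypothetical too.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins (hypothetical) for the two memories. */
static const uint32_t flash_image[4] = { 0x11111111u, 0x22222222u,
                                         0x33333333u, 0x44444444u };
uint32_t ram[8];

/* Copy n_words of initialisers from their load address (LMA, "FLASH")
   to their run-time address (VMA, "RAM"). */
void init_data(uint32_t *vma_start, const uint32_t *lma_start, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++)
        vma_start[i] = lma_start[i];
}

/* Zero a .bss-like range [start, end); it has no load image at all. */
void clear_bss(uint32_t *start, uint32_t *end)
{
    assert(start <= end);
    while (start < end)
        *start++ = 0;
}

/* Simulated reset sequence: .data occupies ram[0..3], .bss ram[4..7]. */
void simulated_startup(void)
{
    memset(ram, 0xFF, sizeof ram);       /* RAM contents are arbitrary at reset */
    init_data(&ram[0], flash_image, 4);  /* destination = VMA, source = LMA */
    clear_bss(&ram[4], &ram[8]);         /* one-past-the-end pointer as the limit */
}
```

The point is that code only ever refers to the RAM (VMA) addresses of variables; the FLASH (LMA) address is needed exactly once, as the source of the copy.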

If you use the Thumb instruction set on Cortex-M4 or -M7, then
Code: [Select]
memcpy32:
    cmp  r2, #0
    ble  .L1
    add  r3, r1, r2, lsl #2
    add  r0, r0, r2, lsl #2
.L3:
    ldr  r2, [r3, #-4]!
    cmp  r3, r1
    str  r2, [r0, #-4]!
    bne  .L3
.L1:
    bx   lr
should work pretty well; it is the assembly equivalent of
Code: [Select]
void memcpy32(unsigned int *dst,
              const unsigned int *src,
              int n) {
    while (n-->0)
        dst[n] = src[n];
}
when the compiler doesn't decide to replace it with memcpy() or memmove().

You can compile assembly to an object file using gcc or g++; it will detect the language correctly based on filename suffix .s.
See arm-none-eabi-gcc-11.2.1 -O2 -march=armv7e-m -mtune=cortex-m7 -mthumb at Compiler Explorer for the above.

What I would actually do, is use the following in my C sources:
Code: [Select]
// Copy 'n' 32-bit words starting at 'src' to 'dst'.
// 'dst' and 'src' must be properly aligned.
__attribute__((used))
void memcpy32(unsigned int *dst,
              const unsigned int *src,
              int n) {
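    if (n > 0) {
        const unsigned int *const end = src + n;
        asm volatile ( "\n"
                       ".Lmemcpy32b_loop:\n\t"
                       "ldr r2, [%[src]], #4\n\t"   // load word, post-increment src
                       "str r2, [%[dst]], #4\n\t"   // store word, post-increment dst
                       "cmp %[src], %[end]\n\t"
                       "blo .Lmemcpy32b_loop\n\t"   // unsigned compare: these are addresses
                     : [dst] "+r" (dst),            // read-write: the loop modifies both pointers
                       [src] "+r" (src)
                     : [end] "r" (end)
                     : "r2", "cc", "memory" );
    }
}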

Quote
I also wonder why use data and data* and common COMMON common* but not COMMON*.
Section names are case sensitive AFAIK, so common and COMMON are different sections.

COMMON is a special section name for common symbols.
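As for where the extra section names come from, here is a small sketch of which C objects usually end up in which input section. The comments describe typical GCC behaviour as I understand it, not a guarantee: with -fdata-sections each object gets its own section such as .data.counter, which is what the .data* pattern is there to catch, and an uninitialised global only lands in COMMON when building with -fcommon (GCC 10 and later default to -fno-common, which puts it in .bss). The variable and function names are made up.

```c
#include <assert.h>

int        counter = 5;        /* nonzero initialiser: .data (or .data.counter) */
int        zeroed  = 0;        /* zero initialiser: normally .bss */
int        tentative;          /* no initialiser: .bss, or COMMON with -fcommon */
const char banner[] = "boot";  /* read-only: .rodata, stays in FLASH */
static int calls;              /* file-local, zero: .bss */

/* Sanity check a startup routine could run once before main(). */
void check_startup_ran(void)
{
    assert(counter == 5);   /* .data was copied from FLASH */
    assert(zeroed == 0);    /* .bss was cleared */
}

/* Returns a value that is only right if .data was copied and .bss zeroed
   before the first call. */
int bump(void)
{
    calls++;
    return counter + zeroed + tentative + calls;
}
```

Only counter needs an image in FLASH that gets copied to RAM; the .bss and COMMON objects just need their RAM range zeroed.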
« Last Edit: March 12, 2023, 12:39:59 pm by Nominal Animal »
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #4 on: March 12, 2023, 11:42:15 am »
Quote
However, .all_nonboot_data would contain a copy of the above

I find that really surprising because the linker scripts I have seen appear to assume that once stuff from one .o file has been collected into a named section (e.g. FLASH_APP) then it is gone.

Quote
*b_mod?.o (.data .data*)

Can one use b_*.o to collect all b_*.o files? That isn't a proper regexp, hence I am not sure. The actual filenames I didn't post but they all start with b_.

Quote
That is still subject to gc by the linker, so do also consider KEEP

I do have a KEEP on all code i.e. TEXT. That was always there. Is it needed for DATA (i.e. nonzero-initialised variables)?

KEEP (*b_*.o) *b_*.o (.data .data*) doesn't work. By inspection of similar code, it needs to be KEEP (*b_*.o (.data .data*)).

Getting there, code running as it should be, but I am trying to also understand it :)


« Last Edit: March 12, 2023, 12:20:09 pm by peter-h »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #5 on: March 12, 2023, 12:37:46 pm »
Quote
I find that really surprising because the linker scripts I have seen appear to assume that once stuff from one .o file has been collected into a named section (e.g. FLASH_APP) then it is gone.
Hm.  Waitasecond.  Indeed, you're right.  :-[

Quote
*b_mod?.o (.data .data*)
Can one use b_*.o to collect all b_*.o files? That isn't a proper regexp, hence I am not sure. The actual filenames I didn't post but they all start with b_.
Yes.  These are glob-style wildcard patterns, not regexes.
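If you want to experiment with the patterns outside the linker, POSIX fnmatch() behaves closely enough for a quick test on a PC. Treat this as an analogy only: I am assuming that ld matches the whole file name, path included, and lets * match / as well, which is what fnmatch() does with flags set to 0. glob_match() and the file names are hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>

/* Returns 1 if 'name' matches the glob-style 'pattern', else 0.
   With flags == 0, '*' is allowed to match '/' as well. */
int glob_match(const char *pattern, const char *name)
{
    assert(pattern && name);
    return fnmatch(pattern, name, 0) == 0;
}
```

For example, glob_match("*b_mod?.o", "b_mod3.o") is true while "b_mod10.o" does not match, because ? stands for exactly one character. Note that a loose pattern like *b_*.o also matches a name such as Debug/Src/usb_cdc.o, since "usb_cdc" contains "b_", so a tighter pattern (or a directory prefix) is safer.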

Quote
KEEP (*b_*.o) *b_*.o (.data .data*) doesn't work. By inspection of similar code, it needs to be KEEP (*b_*.o (.data .data*)).
Right,  :-[ I should've checked.

It really isn't my day today, it seems.

Quote
Right now I am getting weird issues with invalid opcodes (0xffffffff) inserted between C source lines.
Section alignment?

In the output (text) sections, try adding FILL(0xBF00BF00), or add =0xBF00BF00 output section attribute (before > or AT>).  0xBF00 is the Thumb encoding for NOP.  This means that when .text sections from different object files are concatenated, the alignment (per the object file section alignment) is done with 0xBF00BF00 (NOP; NOP) instead of 0xFFFFFFFF.
« Last Edit: March 12, 2023, 12:41:02 pm by Nominal Animal »
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #6 on: March 12, 2023, 02:18:25 pm »
Interestingly, if I uncomment the 2nd KEEP below, I get loads of undefined symbol linker errors

Code: [Select]

/* Initialized data sections for boot block. These go into RAM, load LMA copy after code */
/* This stuff is copied from FLASH to RAM by the startupxxx.s code */
.boot_data :
  {
    . = ALIGN(4);
    _s_boot_data = .;        /* create a global symbol at data start */
    KEEP (*b_*.o (.data .data*))
    *b_*.o (.data .data*)      /* DATA sections */
      . = ALIGN(4);
    _e_boot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_BOOT

  /* used by the startup .s code to initialize data */
  _si_boot_data = LOADADDR(.boot_data);
 
 
/* Initialized data sections for rest of unit. These go into RAM, load LMA copy after code */
/* This stuff is copied from FLASH to RAM by C code in the main stub */
.all_nonboot_data :
  {
    . = ALIGN(4);
    _s_nonboot_data = .;        /* create a global symbol at data start */
    /*KEEP (*(.data .data*))*/
    *(.data .data*)      /* .data sections */
      . = ALIGN(4);
    _e_nonboot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_APP

  /* used by the main stub C code to initialize data */
  _si_nonboot_data = LOADADDR(.all_nonboot_data);

Looking at other KEEP usage, it seems to be used universally for code (TEXT) and ambiguously for other stuff.

The b_*.o etc wildcards do work but you have to be extremely careful with them because if you have a module containing any b_* function inside, and you forgot to include that module earlier (due to e.g. a typo in its name), the linker will pick out the b_*() functions out of it and will dump them somewhere, anywhere, in the remaining code. Just spent a few hours chasing what was basically a typo, but the whole thing compiled with no errors because the linker found those b_*() functions. The .map file eventually revealed boot block functions at addresses outside the boot block.

It's obvious to me why the absolutely crap .ld file syntax is acceptable to people. They waste their life on the linker script precisely once and never again.
« Last Edit: March 13, 2023, 05:24:21 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #7 on: March 13, 2023, 12:50:40 am »
Linker is designed to link and deduplicate, why blame it for not being tailored for the opposite task again and again? The intended way of performing your two call graphs separation task is as simple as invoking the linker two times - one for the bootloader, another one for the main app, that would guarantee bootloader isolation automatically, without ever looking into .map.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #8 on: March 13, 2023, 02:21:30 am »
Quote
Linker is designed to link and deduplicate, why blame it for not being tailored for the opposite task again and again? The intended way of performing your two call graphs separation task is as simple as invoking the linker two times - one for the bootloader, another one for the main app, that would guarantee bootloader isolation automatically, without ever looking into .map.
I must say I agree.  I do not understand why you insist on combining both the bootloader and the normal code in the same target binary.

If the bootloader part provides symbols useful to the application part, perhaps functions like memcpy32(), you can always pick their addresses from the target binary (using objdump and a bit of scripting) or from the .map file, and then declare them in a header file and define in the linker script.  It is easily automated using bash/sed/awk or python or a number of other scripting tools; just make sure you use export LANG=C LC_ALL=C so all tools use the default character set.  I do such stuff (multi-stage builds with Makefiles running external helper scripts) all the time, and it just makes the builds easier to control, and more robust.  More steps, yes, but fewer inter-dependencies.  That's also why I'm so rusty with the linker script syntax: I rarely need to do the weird stuff, so I always need to consult the LD manual!
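One way the "header per bootloader version" idea could look in C is sketched below. Note that this swaps in a different technique from the one described above: instead of defining each function symbol in the application's linker script, it uses a small table of function pointers that the boot block would place at a fixed, documented address. Everything here (struct boot_api, get_boot_api(), the version number, the example address in the comment) is hypothetical, and the host version simply returns a local table so it can be run on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interface the boot block exports to the application. */
struct boot_api {
    uint32_t version;
    void   (*memcpy32)(unsigned int *dst, const unsigned int *src, int n);
};

/* Boot-block side: the implementation and the table that would be linked
   at a fixed, documented address (placement not shown here). */
static void boot_memcpy32(unsigned int *dst, const unsigned int *src, int n)
{
    while (n-- > 0)
        dst[n] = src[n];
}
static const struct boot_api boot_api_table = { 1u, boot_memcpy32 };

/* Application side: on the target this would return a constant address
   taken from the .map file, e.g. (const struct boot_api *)0x08000200
   (made-up value); on the host we return the local table instead. */
const struct boot_api *get_boot_api(void)
{
    return &boot_api_table;
}

/* Example call through the table; returns dst[n - 1] for checking. */
unsigned int copy_via_boot_api(unsigned int *dst, const unsigned int *src, int n)
{
    const struct boot_api *api = get_boot_api();
    assert(api->version == 1u && n > 0);
    api->memcpy32(dst, src, n);
    return dst[n - 1];
}
```

The version field lets the application refuse to run against a boot block whose table layout it does not know.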

I mean, Teensy 4.0 linker script is 109 lines, Teensy 4.1 linker script is 118 lines, and MicroMod Teensy linker script is 109 lines.  The NXP i.MX RT1062 processor in it implements a very similar Cortex-M7 (with hardware single- and double-precision floating point).
Of course, that is just the application part, because the bootloader on these Teensies is actually on a separate NXP MKL02 chip in control of the RT1062, and is not field-updateable.  In practice, it can be considered a pre-set JTAG loader or something like that.  But it shows how simple typical linker scripts are.
(In case you wonder, 4.1 supports one or two 8 Mbyte PSRAM chips, which are mapped at ERAM if soldered in.  It is detected by startup code in the Teensyduino core at runtime.)

For each bootloader version, application developers would have a corresponding header file (in addition to the CMSIS headers or whatever you provide for the board functionality), describing the additional interfaces the bootloader provides; plus the linker script to be used for the application part (and defining the addresses of e.g. function symbols accessible to the application part).
To target different bootloader versions, application developers would build their code against each header file and linker script.

Do note that we're only telling you how we'd do things, exactly because doing them the way you are doing now leads to the kinds of problems (and much increased complexity) you are having now; and not telling you what to do.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #9 on: March 13, 2023, 06:11:53 am »
Quote
I do not understand why you insist on combining both the bootloader and the normal code in the same target binary.

Even if you want the bootloader and application in same binary, you don't need to use linker for that.

Keep bootloader as a completely separate project and just merge the binaries before programming, for example this is what I use lately:
Code: [Select]
srec_cat ../bootloader/bootloader.hex -intel fw_a.hex -intel meta_a.hex -intel -o whole.hex -intel
 
The following users thanked this post: Nominal Animal

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #10 on: March 13, 2023, 07:25:47 am »
I am using two separate linkfiles. One makes one project (including the boot loader) and the other makes one project (which runs from base+32k).

This issue is nothing to do with that. It is to do with LD file syntax and how the flow operates along the file.
« Last Edit: March 13, 2023, 07:37:47 am by peter-h »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #11 on: March 13, 2023, 08:43:10 am »
Quote
It is to do with LD file syntax and how the flow operates along the file.
Your link files are much more complicated than I'm used to seeing.

Take a gander at the link files I linked to in my last message.  The linked ones are for a very comparable NXP i.MX RT1062 Cortex-M7 processor, and they're all (one per variant) less than 120 lines long, even though they are used from the Arduino environment (which has all those wonkinesses wrt. strings being copied from Flash to RAM because some code expects them to be mutable because they are on AVRs because of the lack of address space support in g++).

As to the operation of the link scripts, they remind me a bit of awk.  Not syntactically, but logically; how the selection/filtering works, with awk input records corresponding to source sections to be processed by a linker script.  I can definitely see similarities, so perhaps playing with awk a bit might help?  :-//
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #12 on: March 13, 2023, 09:09:27 am »
Maybe that's because I am trying to load specific modules in specific places.

If you are just building a single monolithic project then the linkfile can be very simple.

The linkfile which builds the combined (boot+rest) project could have been monolithic but without a "boot block" section you don't get a brick recovery option.

This is too complicated to explain without a lot of detail but I am grateful for the help so far. It seems to be running OK. In fact it's been running fine for 2+ years, 24/7, on several boards, until I did some changes and then I had to revisit a load of stuff :)
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #13 on: March 13, 2023, 11:30:56 am »
I have one more Q:

Does the label "code_constants_etc" mean anything here:

Code: [Select]

/* This collects all other stuff, which gets loaded into FLASH after main.o above */
  .code_constants_etc :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _e_code_constants_etc = .;        /* define a global symbol at end of code */
} >FLASH_APP

In other places I have seen "text" used but obviously not just TEXT (executable code) is being collected into FLASH_APP here...

Excessive complexity was mentioned. This is an example of an ex-ST Cube bit which nobody I ever found knew anything about, except that it refers to C++.

Code: [Select]
/* this is for __libc_init_array() */
/* Not sure it is actually used for anything */

.preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH_APP
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH_APP
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(.fini_array*))
    KEEP (*(SORT(.fini_array.*)))
    PROVIDE_HIDDEN (__fini_array_end = .);
  } >FLASH_APP
« Last Edit: March 13, 2023, 05:24:07 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #14 on: March 13, 2023, 12:55:32 pm »
Section names like your .code_constants_etc are just section names in output ELF file, nothing relevant for final bin/hex (unless you are doing some more complicated ELF-to-something conversion).

*init_array* stuff is for calling static initializers (functions declared with __attribute__((constructor)), C++ constructors of static objects) from the startup code. They are arrays of function pointers and a full-featured startup code (not ST's half-baked one) iterates through these arrays, calling each function before entering main(). An example:
Code: [Select]
#include <stdio.h>

static int a=5;

//static void before_main() //normal function
static void __attribute__((constructor)) before_main() //executed before main
{
    a = 9;
}

int main(void)
{
    printf("a=%d\n", a);
    return 0;
}
would print "a=9"
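For the startup side, the loop that walks such an array can be sketched like this. It is a host-runnable stand-in, not the actual library code: fake_init_array replaces the table the linker would build from the .init_array sections, and on a real target the bounds would be the __init_array_start and __init_array_end symbols from the linker script. All other names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

static int a = 5;
static int order[2], n_called;

static void first_ctor(void)  { order[n_called++] = 1; a = 9; }
static void second_ctor(void) { order[n_called++] = 2; }

/* Stand-in for the table the linker builds from the .init_array sections. */
static const init_fn fake_init_array[] = { first_ctor, second_ctor };

/* Call every function pointer in [start, end), in order. */
void run_init_array(const init_fn *start, const init_fn *end)
{
    assert(start <= end);
    for (const init_fn *p = start; p < end; p++)
        (*p)();
}

/* What the startup code would do just before calling main(). */
int startup_then_read_a(void)
{
    run_init_array(fake_init_array,
                   fake_init_array + sizeof fake_init_array / sizeof fake_init_array[0]);
    return a;
}
```

If nothing in a plain C project uses __attribute__((constructor)), the real arrays are simply empty and the loop does nothing.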

> without a "boot block" section you don't get a brick recovery option.
What we are trying to say is that this (exactly this, creating an independent recovery boot block) could be done with much less effort just by linking it separately. I guess you've spent lots of time tracking down and separating every function called from the boot block code path, and yet all this can easily be broken by a minor change and needs to be verified again and again.
I've seen several quite funny security vulnerabilities stemming from a similar approach - a signed "secure boot block" (verified by BootROM), built together with the "app" it was supposed to verify before running, was calling some lib function from that not-yet-verified code :)
« Last Edit: March 13, 2023, 01:12:38 pm by abyrvalg »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #15 on: March 13, 2023, 02:46:24 pm »
Partly the reason I work the way I do is because I run my business on a very long term basis, the buck stops with me totally, I have frequently needed to revisit projects 10 or more years later, and Cube is immensely complicated (to me, not to you experts who I think mostly do this in company time) so I try hard to use it in a simple way which is easy to set up and document.

I know one could set up two projects in Cube, one for say a boot block and the other for the rest, but when I was doing this kind of stuff I found really nasty cases of file references pointing to files in another project, etc. This was an absolute bastard to track down.

I suppose one could have two separate Cube installs but Cube doesn't like that. The 2nd one would probably want to be in a VM - which is exactly what I am doing now actually but for other reasons.

So those constructors are of no relevance to a C project?
 

Online ejeffrey

  • Super Contributor
  • ***
  • Posts: 4090
  • Country: us
Re: A question on GCC .ld linker script syntax
« Reply #16 on: March 13, 2023, 03:05:00 pm »
Quote from: peter-h
Maybe that's because I am trying to load specific modules in specific places.

If you are just building a single monolithic project then the linkfile can be very simple.

The linkfile which builds the combined (boot+rest) project could have been monolithic but without a "boot block" section you don't get a brick recovery option.

When I did this the two linker scripts were both quite simple: mostly just the vendor link script with a different memory offset.  Instead of sorting the objects in the linker I just used two separate projects with their own list of source files.  The default KEEP options worked fine. I didn't need any special section attributes and the resulting hex files are disjoint and easily merged. 
 

Online ejeffrey

  • Super Contributor
  • ***
  • Posts: 4090
  • Country: us
Re: A question on GCC .ld linker script syntax
« Reply #17 on: March 13, 2023, 03:37:58 pm »

Quote from: peter-h
I know one could set up two projects in Cube, one for say a boot block and the other for the rest, but when I was doing this kind of stuff I found really nasty cases of file references pointing to files in another project, etc. This was an absolute bastard to track down.

It seems much less trouble than what you are doing.  I used two *completely* separate projects.  Different directories, different main.c, different configuration files, different everything exactly as if they were for two different microcontrollers entirely.  This also means I can reuse the same bootloader for different projects.

I do have a common directory with utility files and such shared by multiple projects.  Getting that set up right did take a bit of work, but it's something that I needed to do anyway to share code between projects.  Once that worked using it with the bootloader was easy.

 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #18 on: March 13, 2023, 04:03:07 pm »
I have a funny one which may ring a bell with someone.

In the vector table I have

.word     TIM6_DAC_IRQHandler               /* TIM6 and DAC1&2 underrun errors */

and under it I have

   .weak      TIM6_DAC_IRQHandler
   .thumb_set TIM6_DAC_IRQHandler,Default_Handler

This is standard ST code.

Then the real ISR is


/**
  * @brief This function handles TIM6 update interrupt.
  */
void TIM6_DAC_IRQHandler(void)
{
   TIM6_ISR();      //HAL_TIM_IRQHandler(&htim6);
}

and that is supposed to override the "weak" above.

But it isn't happening. The vector points to Default_Handler instead.

If I comment out the two weak & thumb_set lines, it all works.

There is something in the linker script which is failing to override the weak symbol.

Even explicitly linking in the .o file containing the ISR doesn't help.

Maybe I should just delete all the weak vectors to ISRs which do actually exist. After all, anything hitting the default handler won't do anything useful.

I have two projects, one is ok and the other one exhibits the above behaviour. Of course there are differences and I am working through them. The main one is that the working one has some code in the same .s file (in asm) but in a different section. The nonworking one is just a vector table. Both sections are pulled in by the linker script.

The Q is: what could prevent a weak symbol getting overridden, so that the weak one gets used instead? Is there an order in which the two sets of symbols are to be linked?
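One way to see from the running code which vectors are still on the fallback is to compare the handler's address with Default_Handler. The sketch below is only that, a sketch: it assumes GCC or Clang on an ELF target, it rewrites the .weak/.thumb_set pair as a C weak alias, and Default_Handler returns instead of looping so it can be tried on a PC. handler_is_default and demo_real_isr are made-up names.

```c
#include <stdbool.h>

typedef void (*isr_fn)(void);

/* Catch-all handler. The ST startup file loops forever here; this one
 * returns so the sketch can run on a host. */
void Default_Handler(void) { }

/* Weak alias, the C counterpart of the .weak/.thumb_set pair in the .s file.
 * A strong TIM6_DAC_IRQHandler in another object file would replace it. */
void TIM6_DAC_IRQHandler(void) __attribute__((weak, alias("Default_Handler")));

/* Hypothetical "real" ISR, present only so the check has a negative case. */
static volatile unsigned demo_isr_hits;
void demo_real_isr(void) { ++demo_isr_hits; }

/* True if the given vector still resolves to the catch-all handler. */
bool handler_is_default(isr_fn handler)
{
    return handler == Default_Handler;
}
```

In this single file nothing overrides the alias, so handler_is_default(TIM6_DAC_IRQHandler) is true. In a full build it should turn false as soon as the linker actually picks up the strong definition.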

The linker script for the vector module is

Code: [Select]

  .isr_vector2 :
  {
    . = ALIGN(0x200);
    KEEP(*(.isr_vector2.o))
    *isr_vector2.o (.text .text* .rodata .rodata*)
    . = ALIGN(4);
  } >FLASH_APP


and the vector file is

Code: [Select]


   .section  .isr_vector2,"a",%progbits
  .type  g_Vectors2, %object
  .size  g_Vectors2, .-g_Vectors2

  .global  g_Vectors2 /* does nothing */
  .global  Default_Handler /* does nothing */


/* The first two will never get used. This vector table is activated by setting VTOR to point to it
   and a reset will reset VTOR to 0.
   The _estack word should be redundant since SP is already set to top of RAM from b_main, but is
   required because the RTOS fetches it for its internal use.
   The 2nd word is unused because b_main jumps directly to main_stub() at base+32k with a jmp.

*/

g_Vectors2:
  .word  _estack /* in final image, SP is loaded directly before entering main_stub */
  .word  0 /* in final image, this is not used (reset resets VTOR, etc) */
  .word  NMI_Handler
  .word  HardFault_Handler
  .word  MemManage_Handler
  .word  BusFault_Handler
  .word  UsageFault_Handler
  .word  0
  .word  0
  .word  0
  .word  0
  .word  SVC_Handler
  .word  DebugMon_Handler
  .word  0
  .word  PendSV_Handler
  .word  SysTick_Handler

  /* External Interrupts */
  .word     WWDG_IRQHandler                   /* Window WatchDog              */
  .word     PVD_IRQHandler                    /* PVD through EXTI Line detection */
  .word     TAMP_STAMP_IRQHandler             /* Tamper and TimeStamps through the EXTI line */
  .word     RTC_WKUP_IRQHandler               /* RTC Wakeup through the EXTI line */
  .word     FLASH_IRQHandler                  /* FLASH                        */
  .word     RCC_IRQHandler                    /* RCC                          */
  .word     EXTI0_IRQHandler                  /* EXTI Line0                   */
  .word     EXTI1_IRQHandler                  /* EXTI Line1                   */
  .word     EXTI2_IRQHandler                  /* EXTI Line2                   */
  .word     EXTI3_IRQHandler                  /* EXTI Line3                   */
  .word     EXTI4_IRQHandler                  /* EXTI Line4                   */
  .word     DMA1_Stream0_IRQHandler           /* DMA1 Stream 0                */
  .word     DMA1_Stream1_IRQHandler           /* DMA1 Stream 1                */
  .word     DMA1_Stream2_IRQHandler           /* DMA1 Stream 2                */
  .word     DMA1_Stream3_IRQHandler           /* DMA1 Stream 3                */
  .word     DMA1_Stream4_IRQHandler           /* DMA1 Stream 4                */
  .word     DMA1_Stream5_IRQHandler           /* DMA1 Stream 5                */
  .word     DMA1_Stream6_IRQHandler           /* DMA1 Stream 6                */
  .word     ADC_IRQHandler                    /* ADC1, ADC2 and ADC3s         */
  .word     CAN1_TX_IRQHandler                /* CAN1 TX                      */
  .word     CAN1_RX0_IRQHandler               /* CAN1 RX0                     */
  .word     CAN1_RX1_IRQHandler               /* CAN1 RX1                     */
  .word     CAN1_SCE_IRQHandler               /* CAN1 SCE                     */
  .word     EXTI9_5_IRQHandler                /* External Line[9:5]s          */
  .word     TIM1_BRK_TIM9_IRQHandler          /* TIM1 Break and TIM9          */
  .word     TIM1_UP_TIM10_IRQHandler          /* TIM1 Update and TIM10        */
  .word     TIM1_TRG_COM_TIM11_IRQHandler     /* TIM1 Trigger and Commutation and TIM11 */
  .word     TIM1_CC_IRQHandler                /* TIM1 Capture Compare - used by wavegen */
  .word     TIM2_IRQHandler                   /* TIM2                         */
  .word     TIM3_IRQHandler                   /* TIM3                         */
  .word     TIM4_IRQHandler                   /* TIM4                         */
  .word     I2C1_EV_IRQHandler                /* I2C1 Event                   */
  .word     I2C1_ER_IRQHandler                /* I2C1 Error                   */
  .word     I2C2_EV_IRQHandler                /* I2C2 Event                   */
  .word     I2C2_ER_IRQHandler                /* I2C2 Error                   */
  .word     SPI1_IRQHandler                   /* SPI1                         */
  .word     SPI2_IRQHandler                   /* SPI2                         */
  .word     USART1_IRQHandler                 /* USART1                       */
  .word     USART2_IRQHandler                 /* USART2                       */
  .word     USART3_IRQHandler                 /* USART3                       */
  .word     EXTI15_10_IRQHandler              /* External Line[15:10]s        */
  .word     RTC_Alarm_IRQHandler              /* RTC Alarm (A and B) through EXTI Line */
  .word     OTG_FS_WKUP_IRQHandler            /* USB OTG FS Wakeup through EXTI line */
  .word     TIM8_BRK_TIM12_IRQHandler         /* TIM8 Break and TIM12         */
  .word     TIM8_UP_TIM13_IRQHandler          /* TIM8 Update and TIM13        */
  .word     TIM8_TRG_COM_TIM14_IRQHandler     /* TIM8 Trigger and Commutation and TIM14 */
  .word     TIM8_CC_IRQHandler                /* TIM8 Capture Compare         */
  .word     DMA1_Stream7_IRQHandler           /* DMA1 Stream7                 */
  .word     FSMC_IRQHandler                   /* FSMC                         */
  .word     SDIO_IRQHandler                   /* SDIO                         */
  .word     TIM5_IRQHandler                   /* TIM5                         */
  .word     SPI3_IRQHandler                   /* SPI3                         */
  .word     UART4_IRQHandler                  /* UART4                        */
  .word     UART5_IRQHandler                  /* UART5                        */
  .word     TIM6_DAC_IRQHandler               /* TIM6 and DAC1&2 underrun errors */
  .word     TIM7_IRQHandler                   /* TIM7                         */
  .word     DMA2_Stream0_IRQHandler           /* DMA2 Stream 0                */
  .word     DMA2_Stream1_IRQHandler           /* DMA2 Stream 1                */
  .word     DMA2_Stream2_IRQHandler           /* DMA2 Stream 2                */
  .word     DMA2_Stream3_IRQHandler           /* DMA2 Stream 3                */
  .word     DMA2_Stream4_IRQHandler           /* DMA2 Stream 4                */
  .word     ETH_IRQHandler                    /* Ethernet                     */
  .word     ETH_WKUP_IRQHandler               /* Ethernet Wakeup through EXTI line */
  .word     CAN2_TX_IRQHandler                /* CAN2 TX                      */
  .word     CAN2_RX0_IRQHandler               /* CAN2 RX0                     */
  .word     CAN2_RX1_IRQHandler               /* CAN2 RX1                     */
  .word     CAN2_SCE_IRQHandler               /* CAN2 SCE                     */
  .word     OTG_FS_IRQHandler                 /* USB OTG FS                   */
  .word     DMA2_Stream5_IRQHandler           /* DMA2 Stream 5                */
  .word     DMA2_Stream6_IRQHandler           /* DMA2 Stream 6                */
  .word     DMA2_Stream7_IRQHandler           /* DMA2 Stream 7                */
  .word     USART6_IRQHandler                 /* USART6                       */
  .word     I2C3_EV_IRQHandler                /* I2C3 event                   */
  .word     I2C3_ER_IRQHandler                /* I2C3 error                   */
  .word     OTG_HS_EP1_OUT_IRQHandler         /* USB OTG HS End Point 1 Out   */
  .word     OTG_HS_EP1_IN_IRQHandler          /* USB OTG HS End Point 1 In    */
  .word     OTG_HS_WKUP_IRQHandler            /* USB OTG HS Wakeup through EXTI */
  .word     OTG_HS_IRQHandler                 /* USB OTG HS                   */
  .word     DCMI_IRQHandler                   /* DCMI                         */
  .word     0                                 /* CRYP crypto                  */
  .word     HASH_RNG_IRQHandler               /* Hash and Rng                 */
  .word     FPU_IRQHandler                    /* FPU                          */

  .word     main   /* this merely prevents linker removing main() because nothing *calls* it */



/*******************************************************************************
*
* Provide weak aliases for each Exception handler to the Default_Handler.
* As they are weak aliases, any function with the same name will override
* this definition.
*
*******************************************************************************/



   .weak      NMI_Handler
   .thumb_set NMI_Handler,Default_Handler

   .weak      HardFault_Handler
   .thumb_set HardFault_Handler,Default_Handler

   .weak      MemManage_Handler
   .thumb_set MemManage_Handler,Default_Handler

   .weak      BusFault_Handler
   .thumb_set BusFault_Handler,Default_Handler

   .weak      UsageFault_Handler
   .thumb_set UsageFault_Handler,Default_Handler

   .weak      SVC_Handler
   .thumb_set SVC_Handler,Default_Handler

   .weak      DebugMon_Handler
   .thumb_set DebugMon_Handler,Default_Handler

   .weak      PendSV_Handler
   .thumb_set PendSV_Handler,Default_Handler

   .weak      SysTick_Handler
   .thumb_set SysTick_Handler,Default_Handler

   .weak      WWDG_IRQHandler
   .thumb_set WWDG_IRQHandler,Default_Handler

   .weak      PVD_IRQHandler
   .thumb_set PVD_IRQHandler,Default_Handler

   .weak      TAMP_STAMP_IRQHandler
   .thumb_set TAMP_STAMP_IRQHandler,Default_Handler

   .weak      RTC_WKUP_IRQHandler
   .thumb_set RTC_WKUP_IRQHandler,Default_Handler

   .weak      FLASH_IRQHandler
   .thumb_set FLASH_IRQHandler,Default_Handler

   .weak      RCC_IRQHandler
   .thumb_set RCC_IRQHandler,Default_Handler

   .weak      EXTI0_IRQHandler
   .thumb_set EXTI0_IRQHandler,Default_Handler

   .weak      EXTI1_IRQHandler
   .thumb_set EXTI1_IRQHandler,Default_Handler

   .weak      EXTI2_IRQHandler
   .thumb_set EXTI2_IRQHandler,Default_Handler

   .weak      EXTI3_IRQHandler
   .thumb_set EXTI3_IRQHandler,Default_Handler

   .weak      EXTI4_IRQHandler
   .thumb_set EXTI4_IRQHandler,Default_Handler

   .weak      DMA1_Stream0_IRQHandler
   .thumb_set DMA1_Stream0_IRQHandler,Default_Handler

   .weak      DMA1_Stream1_IRQHandler
   .thumb_set DMA1_Stream1_IRQHandler,Default_Handler

   .weak      DMA1_Stream2_IRQHandler
   .thumb_set DMA1_Stream2_IRQHandler,Default_Handler

   .weak      DMA1_Stream3_IRQHandler
   .thumb_set DMA1_Stream3_IRQHandler,Default_Handler

   .weak      DMA1_Stream4_IRQHandler
   .thumb_set DMA1_Stream4_IRQHandler,Default_Handler

   .weak      DMA1_Stream5_IRQHandler
   .thumb_set DMA1_Stream5_IRQHandler,Default_Handler

   .weak      DMA1_Stream6_IRQHandler
   .thumb_set DMA1_Stream6_IRQHandler,Default_Handler

   .weak      ADC_IRQHandler
   .thumb_set ADC_IRQHandler,Default_Handler

   .weak      CAN1_TX_IRQHandler
   .thumb_set CAN1_TX_IRQHandler,Default_Handler

   .weak      CAN1_RX0_IRQHandler
   .thumb_set CAN1_RX0_IRQHandler,Default_Handler

   .weak      CAN1_RX1_IRQHandler
   .thumb_set CAN1_RX1_IRQHandler,Default_Handler

   .weak      CAN1_SCE_IRQHandler
   .thumb_set CAN1_SCE_IRQHandler,Default_Handler

   .weak      EXTI9_5_IRQHandler
   .thumb_set EXTI9_5_IRQHandler,Default_Handler

   .weak      TIM1_BRK_TIM9_IRQHandler
   .thumb_set TIM1_BRK_TIM9_IRQHandler,Default_Handler

   .weak      TIM1_UP_TIM10_IRQHandler
   .thumb_set TIM1_UP_TIM10_IRQHandler,Default_Handler

   .weak      TIM1_TRG_COM_TIM11_IRQHandler
   .thumb_set TIM1_TRG_COM_TIM11_IRQHandler,Default_Handler

   .weak      TIM1_CC_IRQHandler
   .thumb_set TIM1_CC_IRQHandler,Default_Handler

   .weak      TIM2_IRQHandler
   .thumb_set TIM2_IRQHandler,Default_Handler

   .weak      TIM3_IRQHandler
   .thumb_set TIM3_IRQHandler,Default_Handler

   .weak      TIM4_IRQHandler
   .thumb_set TIM4_IRQHandler,Default_Handler

   .weak      I2C1_EV_IRQHandler
   .thumb_set I2C1_EV_IRQHandler,Default_Handler

   .weak      I2C1_ER_IRQHandler
   .thumb_set I2C1_ER_IRQHandler,Default_Handler

   .weak      I2C2_EV_IRQHandler
   .thumb_set I2C2_EV_IRQHandler,Default_Handler

   .weak      I2C2_ER_IRQHandler
   .thumb_set I2C2_ER_IRQHandler,Default_Handler

   .weak      SPI1_IRQHandler
   .thumb_set SPI1_IRQHandler,Default_Handler

   .weak      SPI2_IRQHandler
   .thumb_set SPI2_IRQHandler,Default_Handler

   .weak      USART1_IRQHandler
   .thumb_set USART1_IRQHandler,Default_Handler

   .weak      USART2_IRQHandler
   .thumb_set USART2_IRQHandler,Default_Handler

   .weak      USART3_IRQHandler
   .thumb_set USART3_IRQHandler,Default_Handler

   .weak      EXTI15_10_IRQHandler
   .thumb_set EXTI15_10_IRQHandler,Default_Handler

   .weak      RTC_Alarm_IRQHandler
   .thumb_set RTC_Alarm_IRQHandler,Default_Handler

   .weak      OTG_FS_WKUP_IRQHandler
   .thumb_set OTG_FS_WKUP_IRQHandler,Default_Handler

   .weak      TIM8_BRK_TIM12_IRQHandler
   .thumb_set TIM8_BRK_TIM12_IRQHandler,Default_Handler

   .weak      TIM8_UP_TIM13_IRQHandler
   .thumb_set TIM8_UP_TIM13_IRQHandler,Default_Handler

   .weak      TIM8_TRG_COM_TIM14_IRQHandler
   .thumb_set TIM8_TRG_COM_TIM14_IRQHandler,Default_Handler

   .weak      TIM8_CC_IRQHandler
   .thumb_set TIM8_CC_IRQHandler,Default_Handler

   .weak      DMA1_Stream7_IRQHandler
   .thumb_set DMA1_Stream7_IRQHandler,Default_Handler

   .weak      FSMC_IRQHandler
   .thumb_set FSMC_IRQHandler,Default_Handler

   .weak      SDIO_IRQHandler
   .thumb_set SDIO_IRQHandler,Default_Handler

   .weak      TIM5_IRQHandler
   .thumb_set TIM5_IRQHandler,Default_Handler

   .weak      SPI3_IRQHandler
   .thumb_set SPI3_IRQHandler,Default_Handler

   .weak      UART4_IRQHandler
   .thumb_set UART4_IRQHandler,Default_Handler

   .weak      UART5_IRQHandler
   .thumb_set UART5_IRQHandler,Default_Handler

   .weak      TIM6_DAC_IRQHandler
   .thumb_set TIM6_DAC_IRQHandler,Default_Handler


   .weak      TIM7_IRQHandler
   .thumb_set TIM7_IRQHandler,Default_Handler

   .weak      DMA2_Stream0_IRQHandler
   .thumb_set DMA2_Stream0_IRQHandler,Default_Handler

   .weak      DMA2_Stream1_IRQHandler
   .thumb_set DMA2_Stream1_IRQHandler,Default_Handler

   .weak      DMA2_Stream2_IRQHandler
   .thumb_set DMA2_Stream2_IRQHandler,Default_Handler

   .weak      DMA2_Stream3_IRQHandler
   .thumb_set DMA2_Stream3_IRQHandler,Default_Handler

   .weak      DMA2_Stream4_IRQHandler
   .thumb_set DMA2_Stream4_IRQHandler,Default_Handler

   .weak      ETH_IRQHandler
   .thumb_set ETH_IRQHandler,Default_Handler

   .weak      ETH_WKUP_IRQHandler
   .thumb_set ETH_WKUP_IRQHandler,Default_Handler

   .weak      CAN2_TX_IRQHandler
   .thumb_set CAN2_TX_IRQHandler,Default_Handler

   .weak      CAN2_RX0_IRQHandler
   .thumb_set CAN2_RX0_IRQHandler,Default_Handler

   .weak      CAN2_RX1_IRQHandler
   .thumb_set CAN2_RX1_IRQHandler,Default_Handler

   .weak      CAN2_SCE_IRQHandler
   .thumb_set CAN2_SCE_IRQHandler,Default_Handler

   .weak      OTG_FS_IRQHandler
   .thumb_set OTG_FS_IRQHandler,Default_Handler

   .weak      DMA2_Stream5_IRQHandler
   .thumb_set DMA2_Stream5_IRQHandler,Default_Handler

   .weak      DMA2_Stream6_IRQHandler
   .thumb_set DMA2_Stream6_IRQHandler,Default_Handler

   .weak      DMA2_Stream7_IRQHandler
   .thumb_set DMA2_Stream7_IRQHandler,Default_Handler

   .weak      USART6_IRQHandler
   .thumb_set USART6_IRQHandler,Default_Handler

   .weak      I2C3_EV_IRQHandler
   .thumb_set I2C3_EV_IRQHandler,Default_Handler

   .weak      I2C3_ER_IRQHandler
   .thumb_set I2C3_ER_IRQHandler,Default_Handler

   .weak      OTG_HS_EP1_OUT_IRQHandler
   .thumb_set OTG_HS_EP1_OUT_IRQHandler,Default_Handler

   .weak      OTG_HS_EP1_IN_IRQHandler
   .thumb_set OTG_HS_EP1_IN_IRQHandler,Default_Handler

   .weak      OTG_HS_WKUP_IRQHandler
   .thumb_set OTG_HS_WKUP_IRQHandler,Default_Handler

   .weak      OTG_HS_IRQHandler
   .thumb_set OTG_HS_IRQHandler,Default_Handler

   .weak      DCMI_IRQHandler
   .thumb_set DCMI_IRQHandler,Default_Handler

   .weak      HASH_RNG_IRQHandler
   .thumb_set HASH_RNG_IRQHandler,Default_Handler

   .weak      FPU_IRQHandler
   .thumb_set FPU_IRQHandler,Default_Handler


/**
 * @brief  This is the code that gets called when the processor receives an
 *         unexpected interrupt.  This simply enters an infinite loop, preserving
 *         the system state for examination by a debugger.
 * @param  None
 * @retval None
*/


       .section  .text.Default_Handler,"ax",%progbits
Default_Handler:
Infinite_Loop:
  b  Infinite_Loop
  .size  Default_Handler, .-Default_Handler


The linkfile section which is supposed to collect all other code is

Code: [Select]

/* This collects all other stuff, which gets loaded into FLASH after KDE_main.o above */
  .code_constants_etc :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)


    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _e_code_constants_etc = .;        /* define a global symbol at end of code */
  } >FLASH_APP


and this does work, but fails to override the weak symbols earlier.

Another clue is that out of ~500k of code I have only a few k left. So the linker is removing a load of stuff. I've had this before. I had a main() to which I was jumping with a non-obvious asm jump so it was never called from C. I solved that one with .word main at the end of the vector table (an unused location).

Maybe there is something extra which needs to be done to get the entry point to be 0x8008000 and not 0x8000000. However I could not get the ENTRY linkfile bit to work. ST GCC doesn't support the various formats online e.g. ENTRY(0x8008000). It needs to be a symbol but only a special class of symbols, not just any old label that follows.
« Last Edit: March 13, 2023, 07:11:05 pm by peter-h »
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #19 on: March 13, 2023, 10:23:29 pm »
Quote from: peter-h
Maybe there is something extra which needs to be done to get the entry point to be 0x8008000 and not 0x8000000. However I could not get the ENTRY linkfile bit to work. ST GCC doesn't support the various formats online e.g. ENTRY(0x8008000). It needs to be a symbol but only a special class of symbols, not just any old label that follows.

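ENTRY is needed for the ELF loader -- literally, the piece of code that loads the ELF file into RAM and jumps to the symbol specified in ENTRY. For getting your code to start somewhere else it is useless, although with --gc-sections the linker does treat the ENTRY symbol as a root when deciding which sections to keep.

A Cortex-M CPU, when it comes out of reset, reads the PC from the second word of flash, i.e. from 0x8000004 when flash starts at 0x8000000 as in your case. You cannot bend the CPU to jump anywhere else, no matter how hard you try.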

Edit: first word is SP, of course.
« Last Edit: March 13, 2023, 10:30:26 pm by eutectique »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #20 on: March 13, 2023, 10:29:03 pm »
Just so you all can have a laugh, spot the difference between these two. The 1st one is right

Code: [Select]
  .isr_vector2 :
  {
    . = ALIGN(0x200);
    KEEP(*(.isr_vector2))
    *isr_vector2.o (.text .text* .rodata .rodata*)
    . = ALIGN(4);
  } >FLASH_APP

  .isr_vector2 :
  {
    . = ALIGN(0x200);
    KEEP(*(.isr_vector2.o))
    *isr_vector2.o (.text .text* .rodata .rodata*)
    . = ALIGN(4);
  } >FLASH_APP

There is a load of examples out there where they use the KEEP(*(.modulename.o)). With the .o. You know why it works? Because 99% of the time there is nothing to KEEP. The f-ing stupid linkfile syntax has almost no warnings or error checking. It works because unless you are writing complete crap, there isn't any unreachable code that you need. But this breaks with vector tables and such, because the stuff isn't pointed to by anything. So the linker strips it.

And the ".isr_vector2 :" label at the top is just pure BS. It could be "fred". So why do people write linkfiles with this ambiguity all over the place? Like ".text.isr_vector2 :" which is even more BS; the leading "text" is not related to the TEXT segment.

I am still working on the weak symbols not getting overridden...

Quote
ENTRY is needed for the ELF loader -- literally, the piece of code that loads ELF file into RAM and jumps to the symbol specified in ENTRY. For your needs it is useless.

Thanks :)

Quote
Cortex-M CPU, when it gets out of reset, reads PC from the first word of flash, at address 0x8000000 in your case.

The 2nd word actually. The 1st word goes into SP.

The 0x8008000 stuff is an "overlay" which is entered (from fully set up and running code, obviously) with a jmp. So no reset is involved. I also load SP directly before the jmp. That part works fine.

I agree there would be no way to enter an overlay at 0x8008000 via a reset.
« Last Edit: March 13, 2023, 10:31:35 pm by peter-h »
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #21 on: March 13, 2023, 10:43:33 pm »
Quote
Cortex-M CPU, when it gets out of reset, reads PC from the first word of flash, at address 0x8000000 in your case.

The 2nd word actually. The 1st word goes into SP.

Yes, silly me, corrected the reply 3 seconds after hitting send.


Quote from: peter-h
The 0x8008000 stuff is an "overlay" which is entered (from fully set up and running code, obviously) with a jmp. So no reset is involved. I also load SP directly before the jmp. That part works fine.

The usual way of doing this is

Code: [Select]
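#include <stdint.h>

/* naked: no prologue/epilogue, so pc is still in r0 and sp in r1 (AAPCS).
   GCC wants the attribute before the declarator in a definition. */
__attribute__((naked)) void jump_to(uint32_t pc, uint32_t sp) {
    __asm volatile("  msr msp, r1    \n"
                   "  bx  r0         \n");
}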

I am sure you are using something similar. Don't forget to set pc to an odd value (bit 0 set selects Thumb state for bx).
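A jump_to() like this is commonly fed from the first two words of the target image's vector table, and a cautious bootloader might sanity-check those words first. Below is a rough sketch of such a check, assuming the image carries a real reset vector in word 1 (peter-h's second table deliberately has 0 there, so it would not apply as-is). The struct and function names are invented, and the RAM/flash ranges are left as parameters rather than guessed.

```c
#include <stdbool.h>
#include <stdint.h>

/* First two words of a Cortex-M vector table. */
struct vector_head {
    uint32_t initial_sp;  /* word 0: initial stack pointer */
    uint32_t reset_pc;    /* word 1: entry address, bit 0 set for Thumb */
};

/* Half-open address range [base, base + size). */
struct addr_range {
    uint32_t base;
    uint32_t size;
};

static bool in_range(uint32_t addr, struct addr_range r)
{
    return addr >= r.base && (addr - r.base) < r.size;
}

/* Plausibility check before handing control to another image.
 * The stack is full-descending, so an initial SP equal to the end of RAM
 * (base + size) is legitimate and is accepted. */
bool vector_head_plausible(struct vector_head v,
                           struct addr_range ram,
                           struct addr_range flash)
{
    bool sp_ok = (v.initial_sp & 3u) == 0
              && v.initial_sp > ram.base
              && (v.initial_sp - ram.base) <= ram.size;
    bool pc_ok = (v.reset_pc & 1u) != 0
              && in_range(v.reset_pc & ~1u, flash);
    return sp_ok && pc_ok;
}
```

An erased flash sector reads as 0xFFFFFFFF in both words; that value fails the SP test, which is most of the point of checking before the jump.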
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #22 on: March 13, 2023, 10:52:13 pm »
Quote
And the ".isr_vector2 :" label at the top is just pure BS. It could be "fred".
When you are looking at your section info from something like objdump (size, vma, lma, etc.), it is probably not going to be very helpful if your section names are named after people.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #23 on: March 13, 2023, 10:56:24 pm »
Yes:

Code: [Select]

extern char code_base;         // This is base+32k
uint32_t jmpaddr = (uint32_t)&code_base;                 // Weird stuff to get a usable value
jmpaddr |= 1; // Bit 0 = 1 for thumb code
asm("bx %0"::"r" (jmpaddr)); // jmp (bx) etc
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #24 on: March 13, 2023, 11:26:13 pm »
No assembly required-
https://godbolt.org/z/PsnGff93o
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 560
  • Country: sk
Re: A question on GCC .ld linker script syntax
« Reply #25 on: March 13, 2023, 11:42:52 pm »
Quote
weak...

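Somebody said above that a library (.a) is only a collection of .o files. It is, but the linker does not treat it as such. There are several subtle but important differences, one being that weak symbols from a .o are not overridden by symbols from a .a: the linker only pulls a member out of an archive to resolve a symbol that is still undefined, and a weak definition already counts as defined.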

https://stackoverflow.com/questions/36994557/gnu-ld-weak-declaration-overriding-strong-declaration and links from there.

JW
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #26 on: March 13, 2023, 11:55:41 pm »
Quote from: cv007
No assembly required-
https://godbolt.org/z/PsnGff93o

In principle, yes, nice trick. Practically, when jumping from the bootloader into the app, it would be good to place memory barriers, invalidate the cache (if present), and set up the stack pointer.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #27 on: March 14, 2023, 12:33:07 am »
Quote
it would be good to place memory barriers, invalidate cache (if present), and set up the stack pointer.
Then I guess add some code from the cmsis headers before the goto- __set_MSP, __DMB, etc.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #28 on: March 14, 2023, 06:47:17 am »
Yes - this worked also. The "extern" is the key hack

   extern int main();
   main();

This is just a way to show that you are a C expert ;)

                extern uint32_t code_base;
                goto *(&code_base);

I really have no idea what line 2 does. & takes the address of, but why the * ? I've never used function pointers in C. Only in asm. Anyway, can't beat asm; it does exactly what it says on the tin :)

Quote
it would be good to place memory barriers, invalidate cache (if present), and set up the stack pointer.

I do this just before the jmp:

   // Set SP to top of RAM
   asm volatile ("ldr sp, = _estack \n");

Quote
Then I guess add some code from the cmsis headers before the goto- __set_MSP, __DMB, etc.

Why, when this is just one load of code transferring control to another load of code? And it happens before interrupts are enabled, before RTOS starts...

Quote
Somebody said above that library (.a) is only a collection of .o. It is, but the linker does not treat it as such. There are several subtle but important differences, one being that weak symbols from .o are not overriden by symbols from .a.

That's staggering.

But I don't think it is my problem right now. I have weak symbols in a .s file and am trying to override them with a .c (compiled to .o) file. I continue to work on this one. It works in another version of the project and has done so for years, so is somehow very narrowly context dependent.

Still amazed over those KEEP directives. Mostly they are unnecessary BS which people just throw everywhere.

Same with the __libc_init_array stuff apparently. Somebody put it there and everybody just left it. All I see is this in the .map file

Code: [Select]

.preinit_array  0x000000000807095c        0x0
                0x000000000807095c                PROVIDE (__preinit_array_start = .)
 *(.preinit_array*)
                0x000000000807095c                PROVIDE (__preinit_array_end = .)

.init_array     0x000000000807095c        0x8
                0x000000000807095c                PROVIDE (__init_array_start = .)
 *(SORT_BY_NAME(.init_array.*))
 .init_array.00000
                0x000000000807095c        0x4 ../LIBC/LIBCW\libc-weakened.a(lib_a-__call_atexit.o)
 *(.init_array*)
 .init_array    0x0000000008070960        0x4 c:/st/stm32cubeide_1.11.0/stm32cubeide/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.win32_1.0.200.202301161003/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/thumb/v7e-m+fp/hard/crtbegin.o
                0x0000000008070964                PROVIDE (__init_array_end = .)

.fini_array     0x0000000008070964        0x4
                0x0000000008070964                PROVIDE (__fini_array_start = .)
 *(.fini_array*)
 .fini_array    0x0000000008070964        0x4 c:/st/stm32cubeide_1.11.0/stm32cubeide/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.win32_1.0.200.202301161003/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/thumb/v7e-m+fp/hard/crtbegin.o
 *(SORT_BY_NAME(.fini_array.*))
                0x0000000008070968                PROVIDE (__fini_array_end = .)
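For context on what those arrays are for: newlib's `__libc_init_array` walks the function pointers between `__preinit_array_start`/`__preinit_array_end` and `__init_array_start`/`__init_array_end` and calls each one in order, which is how `__attribute__((constructor))` functions and C++ static constructors get run before `main()`. A minimal host-side sketch of that loop, assuming a hand-built array in place of the linker-provided symbols (`run_init_array`, `demo_ctor_a`, `demo_ctor_b`, `demo_init_array` and `demo_run_initialisers` are made-up names for illustration):

```c
#include <stddef.h>

typedef void (*init_fn)(void);

static int demo_init_count;   /* hypothetical: tracks which initialisers ran */

static void demo_ctor_a(void) { demo_init_count += 1; }
static void demo_ctor_b(void) { demo_init_count += 10; }

/* Call every entry in [start, end), in order. On the target, start and end
   would be the linker symbols __init_array_start and __init_array_end. */
static void run_init_array(const init_fn *start, const init_fn *end)
{
    for (const init_fn *p = start; p != end; ++p) {
        (*p)();
    }
}

/* Stand-in for the .init_array output section. */
static const init_fn demo_init_array[] = { demo_ctor_a, demo_ctor_b };

/* Runs the stand-in array once and returns the running total. */
int demo_run_initialisers(void)
{
    run_init_array(demo_init_array, demo_init_array + 2);
    return demo_init_count;
}
```

In the map above, the 0x8 bytes of .init_array are two such 4-byte pointers, one from the libc member and one from crtbegin.o. Whether anything in a given project depends on them being called is something to check rather than assume.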


« Last Edit: March 14, 2023, 07:09:17 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #29 on: March 14, 2023, 07:01:09 am »
Quote
Still amazed over those KEEP directives. Mostly they are unnecessary BS which people just throw everywhere.
They only matter when link-time garbage collection is used, i.e. --gc-sections is used.

For library-style code, -ffunction-sections -fdata-sections at compile time (CFLAGS), and -Wl,--gc-sections at link time (LDFLAGS using gcc to link) does omit unneeded functions and data (and is also what generates all those .text.suffix and .data.suffix sections, one per function or variable), so that only actually used code and data will be linked into the final binary.  You have to admit it is very useful in some use cases.

It has to be done this way, because ELF linkers operate on sections, and not at the individual symbol level.  (This is what reminds me so much of awk, if we compare ELF sections to awk records.  I can certainly understand why the original implementors wrote it this way, even if I myself would now do it differently.  It does significantly reduce the annoyance factor for myself.)
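A classic case where KEEP does matter is a vector table: nothing in the C code references it by name, so with --gc-sections the linker would discard its input section unless the script wraps it, e.g. KEEP(*(.isr_vector)). A small sketch of the C side, assuming a made-up section name `.demo_vectors` and made-up handler names. Note that the `used` attribute only stops the compiler from dropping the object; protecting it from section garbage collection is the job of KEEP in the linker script.

```c
#include <stddef.h>

typedef void (*vector_fn)(void);

static int demo_last_irq;  /* hypothetical: records which handler ran */

static void demo_usart1_handler(void) { demo_last_irq = 1; }
static void demo_usart2_handler(void) { demo_last_irq = 2; }

/* Placed in its own section so a linker script can locate it and wrap it in
   KEEP(*(.demo_vectors)). On a real target nothing references the table by
   name at run time: the CPU fetches the entries by address. */
__attribute__((used, section(".demo_vectors")))
static const vector_fn demo_vectors[] = {
    demo_usart1_handler,
    demo_usart2_handler,
};

/* Host-side helper: dispatch through the table the way hardware would. */
int demo_dispatch(size_t index)
{
    demo_vectors[index]();
    return demo_last_irq;
}
```

Without --gc-sections on the link line, the KEEP wrapper has no effect either way, which matches the observation that most of them look redundant.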
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #30 on: March 14, 2023, 07:11:37 am »
Indeed

 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #31 on: March 14, 2023, 07:19:04 am »
I have now realised that wek is exactly right.

The vectors file compiles to a .o but the real ISRs are in a library .a.

Bloody hell.

The simplest way is to get rid of the weaks for ISRs that actually exist. This is all inherited code; I never write stuff like this. Let's face it - the code will not run unless the real function is provided, so all this gives you is a project that compiles and then bombs.
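The mechanism behind this, as far as I understand it: the linker only extracts a member from a .a to satisfy a symbol that is still undefined. A weak definition in an already-linked .o counts as defined, so the library member holding the strong handler is never pulled in and the weak default wins. A minimal sketch of the usual weak-default pattern, with made-up names (`Demo_IRQHandler`, `demo_fire_irq`):

```c
/* Weak default, as a startup file would provide it. A strong definition of
   Demo_IRQHandler in another object file that is actually linked replaces
   this one; a strong definition sitting in a library member that is never
   extracted does not. */
__attribute__((weak)) int Demo_IRQHandler(void)
{
    return -1;  /* -1 marks "default handler ran" in this sketch */
}

/* Calls whichever definition the linker selected. */
int demo_fire_irq(void)
{
    return Demo_IRQHandler();
}
```

Besides deleting the weak defaults, other ways around it are to put the handler objects directly on the link line, or to force extraction from the library with ld's -u symbol or --whole-archive options.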
« Last Edit: March 14, 2023, 08:06:28 am by peter-h »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #32 on: March 14, 2023, 10:38:32 am »

Quote
                extern uint32_t code_base;
                goto *(&code_base);

I really have no idea what line 2 does. & takes the address of, but why the * ? I've never used function pointers in C. Only in asm.

This is not a function pointer. This is not even C language. It's a GCC extension, a so-called computed goto.

If you googled "C function pointer", it would be instantly obvious that it does not involve using goto.
 
The following users thanked this post: newbrain

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #33 on: March 14, 2023, 10:43:40 am »
Yup.  It's also related to the GCC label address extension: &&label is a void pointer you can use with goto (for example, in an expression that chooses between multiple ones), as well as an initializer (either after, or within the same block in which, the label "label" has been declared).

You can even implement a Duff's Device without a switch() this way; just define an array of void pointers to labels, and then you can do a computed goto, indexing the array.
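A small illustration of the labels-as-values extension described above. This is GNU C, not ISO C, so it needs GCC or Clang; `demo_classify` is a made-up function that picks a label from an array of void pointers and jumps to it:

```c
/* GNU C extension: &&label yields a void * that "goto *ptr" can jump to. */
int demo_classify(unsigned op)
{
    static void *const targets[] = { &&op_zero, &&op_one, &&op_other };

    if (op > 2u) {
        op = 2u;              /* clamp anything unknown to the last label */
    }
    goto *targets[op];        /* computed goto: jump to the chosen label */

op_zero:
    return 100;
op_one:
    return 200;
op_other:
    return -1;
}
```

The goto *(&code_base) form is the same construct with an address that comes from a linker symbol instead of a label. The GCC manual warns that using this mechanism to jump to code in a different function gives unpredictable results, so the plain call through extern int main(); is the safer of the two.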
« Last Edit: March 14, 2023, 10:45:17 am by Nominal Animal »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #34 on: March 14, 2023, 11:48:43 am »
Quote
If you googled "C function pointer", it would be instantly obvious that it does not involve using goto.

You rarely waste an opportunity to tell me that I am intellectually inferior to you :) I don't mind at all, so long as I am learning, but in this case I learnt nothing (I know a goto is not a function pointer) so you need to up your game :) I use a goto occasionally; it is handy for error handling. So I could see this is a form of computed goto. Anyway, asm is best; always works.

Anyway, following wek's tip I have it all running now :) I would have never found that .o - .a conflict, but I just implemented what I was going to do anyway once I found the weaks were not working: delete the weak defs. I also replaced all unused vectors with .word 0:

Code: [Select]
  .word  0 /* TIM4_IRQHandler                    TIM4                         */
  .word  0 /* I2C1_EV_IRQHandler                 I2C1 Event                   */
  .word  0 /* I2C1_ER_IRQHandler                 I2C1 Error                   */
  .word  0 /* I2C2_EV_IRQHandler                 I2C2 Event                   */
  .word  0 /* I2C2_ER_IRQHandler                 I2C2 Error                   */
  .word  0 /* SPI1_IRQHandler                    SPI1                         */
  .word  0 /* SPI2_IRQHandler                    SPI2                         */
  .word  USART1_IRQHandler                  /* USART1                       */
  .word  USART2_IRQHandler                  /* USART2                       */
  .word  USART3_IRQHandler                  /* USART3                       */
  .word  0 /* EXTI15_10_IRQHandler               External Line[15:10]s        */
  .word  0 /* RTC_Alarm_IRQHandler               RTC Alarm (A and B) through EXTI Line */
  .word  0 /* OTG_FS_WKUP_IRQHandler             USB OTG FS Wakeup through EXTI line */

so any interrupt enabled but not handled will properly crash the system, but that would happen equally with the weak symbols.
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #35 on: March 14, 2023, 12:16:54 pm »
Quote
it would be good to place memory barriers, invalidate cache (if present), and set up the stack pointer.
Then I guess add some code from the cmsis headers before the goto- __set_MSP, __DMB, etc.

Which, in case of gcc, boils down to __ASM volatile ("dmb 0xF":::"memory");

https://github.com/ARM-software/CMSIS_5/blob/develop/CMSIS/Core/Include/cmsis_gcc.h#L286-L289
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #36 on: March 14, 2023, 01:16:07 pm »
Quote
If you googled "C function pointer", it would be instantly obvious that it does not involve using goto.
You rarely waste an opportunity to tell me that I am intellectually inferior to you

I nearly forgot this time - I had to edit it in!

But I have to say again, there's absolutely nothing wrong with your intelligence. I'm just wondering you have a lot of what we would call sisu here in Finland, basically persistence "to go through a grey rock", another saying I like here. We appreciate that mindset here, doing something the hard way and learn in the process.

But I mean, function pointer syntax can look daunting sometimes, which is why you should just google, without trying to remember all that stuff. I Google "C function pointer" almost every time I need to use it. Not intelligent enough to automatically remember.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #37 on: March 14, 2023, 01:29:50 pm »
Quote
Which, in case of gcc, boils down to __ASM volatile ("dmb 0xF":::"memory");

Why would that be needed?

I have seen this stuff in ETH code but it turned out to be not required on a 32F417, which has no data cache.

Quote
doing something the hard way and learn in the process.

I do things in a way which I can understand, and which I know I can document (most coders document exactly nothing) and support when a customer has a problem. I've never worked with anyone who had to worry about the last two.
« Last Edit: March 14, 2023, 02:06:59 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #38 on: March 14, 2023, 02:48:31 pm »
Why invalidate the cache or issue a DMB before a normal flash-to-flash jump? There should be no difference from any local jump/call in this case (unlike e.g. jumping to code freshly loaded to RAM, when the cache can contain stale data (reason to invalidate) or some reordered RAM store isn't completed yet (reason to DMB)).
There is a possible problem of a slightly different kind: setting the MSP says goodbye to all stack-based local vars, so it would be better to place it in the same __asm() as the final jump, to be sure that no stack var content is used afterwards, e.g. as a jump target.
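A sketch of that suggestion for a Cortex-M with GCC: get both values into registers first, then set MSP and branch inside one asm statement so the compiler cannot touch the stack in between. The names `demo_thumb_entry` and `demo_jump_to_app` are made up, and the ARM-only part is guarded so the file still compiles on a host.

```c
#include <stdint.h>

/* A BX target must have bit 0 set to stay in Thumb state on Cortex-M. */
uint32_t demo_thumb_entry(uint32_t address)
{
    return address | 1u;
}

#if defined(__arm__) && defined(__thumb__)
/* Set the main stack pointer and jump, never to return. Both operands are
   already in registers when the asm starts, so no stack access can happen
   between the MSR and the BX. */
__attribute__((noreturn))
void demo_jump_to_app(uint32_t stack_top, uint32_t entry)
{
    __asm volatile (
        "msr msp, %0 \n\t"
        "bx  %1      \n\t"
        :
        : "r" (stack_top), "r" (demo_thumb_entry(entry))
        : "memory");
    __builtin_unreachable();
}
#endif
```

With the symbols used earlier in the thread, a call might look like demo_jump_to_app((uint32_t)&_estack, (uint32_t)&code_base); that usage is an assumption about how those linker symbols are declared.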

peter-h, sorry for my exUSSR straightness, but there is a difference between "I've tried to read the manual on subj A but parts B and C are not clear to me" and "I'm too busy to read first few lines of the first google result" or "the manual has 1000 pages, let someone other read it for me".
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #39 on: March 14, 2023, 03:39:44 pm »
Quote
Why invalidate the cache or issue a DMB before a normal flash-to-flash jump? There should be no difference from any local jump/call in this case (unlike e.g. jumping to code freshly loaded to RAM, when the cache can contain stale data (reason to invalidate) or some reordered RAM store isn't completed yet (reason to DMB)).
There is a possible problem of a slightly different kind: setting the MSP says goodbye to all stack-based local vars, so it would be better to place it in the same __asm() as the final jump, to be sure that no stack var content is used afterwards, e.g. as a jump target.

Sure. That jmp is however always flash to flash, and AFAICT the assembler produced by GCC contains loads of jumps. Well, on a quick look I can't see any right now...

At the jmp point I am chucking away all stack based variables. There should be none anyway. A bit dirty but I can't see a problem; this is a one-way path. Nothing is returning.

Quote
sorry for my exUSSR straightness, but there is a difference between "I've tried to read the manual on subj A but parts B and C are not clear to me" and "I'm too busy to read first few lines of the first google result" or "the manual has 1000 pages, let someone other read it for me".

I don't think I have ever said that. I have often commented on the 2000 page RM which, even if you find the right section, is incredibly obtuse. It is more like an encyclopedia. I think asking a specific question is ok. This is not like the ST forum where 90% of the posts are from somebody who bought a ST devkit and says "HELP - the LED is not flashing".

I do spend many hours googling (that's how most code is written today) but much of the material is simply wrong, impossible for me to understand, doesn't work, and most hits lead to unsolved problems.

Also note that I post the finished solutions. Almost nobody does that, because most people are coding in their employer's time and won't post any working code. I think that's one of the biggest problems in this area. It is also why there is so much crap on github.

I am from the Iron Curtain area too :)
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #40 on: March 14, 2023, 03:53:20 pm »
Quote
Which, in case of gcc, boils down to __ASM volatile ("dmb 0xF":::"memory");
Much code in the cmsis header is inline asm, just pointing out that it may already have been done for you with no need to do it yourself. I think in most cases I would rather use a version from cmsis rather than create my own which does the same thing.

https://godbolt.org/z/5We9d8aqP
Don't even need to make sure bit0 is set, as the goto will do that for you. The attributes are not necessary, but the noinline may make one feel better that this code will stay together as a group.
« Last Edit: March 14, 2023, 04:16:48 pm by cv007 »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #41 on: March 14, 2023, 04:22:13 pm »
What is the difference between these two

Code: [Select]
MSR msp, r3

asm volatile ("ldr sp, = _estack \n");

Remember this is before ints are enabled and before RTOS is started.

I am trying to stick to the simplest way to do stuff.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #42 on: March 14, 2023, 04:56:29 pm »
Quote
I am trying to stick to the simplest way to do stuff.
Then maybe just use the cmsis headers to do some of these things. Loading sp like you did is not possible for the M0 for example (ldr restrictions), and if it is an ok instruction (in your case) you still are not choosing which stack pointer is being set. They provided instructions to handle these types of things, and already have higher level inline functions you can use such as __set_MSP. I doubt one could come up with something simpler.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #43 on: March 14, 2023, 05:15:21 pm »
On M4, what is called "sp" is the MSP (Main Stack Pointer) which is used during startup and during main(). Interrupts are also using this stack.

PSP (Process Stack Pointer) is used by the process/tasks, only once the RTOS is started.

For example here https://vivonomicon.com/2018/04/20/bare-metal-stm32-programming-part-2-making-it-to-main/

Code: [Select]
LDR  r0, =_estack
MOV  sp, r0

and STM themselves, in their 32F407/417 startupxxx.s, use

Code: [Select]
ldr   sp, =_estack

Looking at the cmsis macros, they still do just the one thing:

Code: [Select]

/**
  \brief   Set Main Stack Pointer
  \details Assigns the given value to the Main Stack Pointer (MSP).
  \param [in]    topOfMainStack  Main Stack Pointer value to set
 */
__STATIC_FORCEINLINE void __set_MSP(uint32_t topOfMainStack)
{
__ASM volatile ("MSR msp, %0" : : "r" (topOfMainStack) : );
}
« Last Edit: March 14, 2023, 05:17:29 pm by peter-h »
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #44 on: March 14, 2023, 06:13:40 pm »
Quote
For example here
They are using an M0+ so cannot directly load sp and need to go through an r0-r7 register. If you have an M4 you can ldr sp as you have shown as there is no r0-r7 restriction. Better than either, it seems to me, is to just use the cmsis function and be done. No need to figure out what mcu you have, what stack pointer is the target of your value, what registers to use if needed, inline asm syntax, plus it will work the same when you switch to some other cortex-m. What is the downside of using the provided inline function?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #45 on: March 14, 2023, 06:43:05 pm »
I am more concerned if I am doing something wrong, or not doing something I should be doing.

Does a "long jump" have some special requirements?

I agree re generic code but the end result is still a specific CPU, with a ton of software which is totally non generic. For example the GPIO pin to peripheral mapping is totally specific to the project. It is possible to use a 32F437 instead of a 32F417 and change nothing (well, a couple of tiny details like internal measurement of the RTC battery differ). Or a 407 if you rebuild to use software crypto. But in general a product is specific in so many ways that if you changed the CPU you would need to go over every single line of source.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #46 on: March 14, 2023, 10:14:47 pm »
Quote
I agree re generic code but the end result is still a specific CPU, with a ton of software which is totally non generic.
Of course, but creating your own inline asm to set the stack, in whatever manner you choose, just because the cmsis version works for all cortex-m and all compilers doesn't make sense.

Quote
Does a "long jump" have some special requirements?
It's a 32bit mcu with 32bit registers and a 32bit address space, so define 'long'.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #47 on: March 14, 2023, 11:04:24 pm »
Quote
so define 'long'.

Do you recognise the long call attrib?

Code: [Select]

// === At this point, interrupts and DMA must still be disabled ====
// Execute loader. Reboots afterwards.

extern char _loader_start;
extern char _loader_end;
extern char _loader_loadaddr;
extern char _loader_bss_start;
extern char _loader_bss_end;

// Copy loader code to RAM. This also sets up its initialised data.
B_memcpy(&_loader_start, &_loader_loadaddr, &_loader_end - &_loader_start);

// Clear loader's BSS and COMMON
B_memset(&_loader_bss_start, 0, &_loader_bss_end - &_loader_bss_start);

// See comments in loader.c for why the long call.

extern void loader_entry() __attribute__((long_call));
loader_entry();

// never get here (loader always reboots)
for (;;);

The comment referred to is

 * ALSO ANY FUNCTIONS CALLED FROM THE BOOT LOADER, AND WHICH ARE NOT IN LOADER.C (WHICH ARE
 * IN THE BOOT BLOCK) NEED THE LONG CALL ATTRIBUTE IN THEIR PROTOTYPE IN B_MAIN.H. It is not clear
 * why this is required and it may just be that debugging works properly. Otherwise, for long calls
 * or jumps the linker adds "veneer code" which needs switching to asm code mode to step through.

There was a thread on the veneer code...

I still have a bit of trouble which I suspect is due to BSS not getting zeroed, or something which should be zeroed not getting loaded into BSS. I checked the BSS area for being zeroed by looking at memory, when the memset func is running...

In this code

Code: [Select]

void KDE_get_date_time(char *datetime)
{

RTC_TimeTypeDef sTime = {0};
RTC_DateTypeDef sDate = {0};

HAL_RTC_GetTime(&hrtc, &sTime, RTC_FORMAT_BIN);
HAL_RTC_GetDate(&hrtc, &sDate, RTC_FORMAT_BIN);

int i = (sDate.WeekDay - 1) * 3;

sprintf(datetime, "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",
d_o_w[i],d_o_w[i+1],d_o_w[i+2],sDate.Date, sDate.Month, sDate.Year,
sTime.Hours, sTime.Minutes, sTime.Seconds, day_of_year(sDate.Date, sDate.Month-1, sDate.Year+100));

}

how would you expect those two structs to be zeroed? I would think that if they were static they would be in BSS. But I think there is special code at the start of that function, which might be a string in BSS which gets copied, or perhaps a "memset"? The sprintf() runs for ever. OK; I know it should be an snprintf() but that isn't my code, and it isn't supposed to go off the end.

 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 879
Re: A question on GCC .ld linker script syntax
« Reply #48 on: March 14, 2023, 11:49:02 pm »
Quote
NEED THE LONG CALL ATTRIBUTE IN THEIR PROTOTYPE
I don't get that advanced, but I assume there can be cases where the compiler emits a BL instruction where the address cannot be reached (BL limitation, address not known to the compiler at compile time, the linker will fill in, and is out of BL range). The long_call attribute I presume will make the compiler end up using BLX instead whether required or not. Or something.

edit- I guess those that use ram functions will know all about this. It would appear the gcc compiler treats all symbols as BL reachable, so the linker is required to fix up things like making executable code located in data/ram reachable by creating extra code to bridge from BL reachable to BL unreachable.
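Some arithmetic makes the "long" part concrete. On ARMv7-M the Thumb-2 BL instruction encodes a PC-relative offset of about ±16 MB, while flash at 0x08000000 and RAM at 0x20000000 are 0x18000000 bytes (384 MB) apart, so a direct BL from boot-block code to a loader copied into RAM cannot reach. A rough check, assuming the usual Thumb rule that the offset is measured from the instruction address plus 4 (`demo_bl_reachable` is a made-up helper):

```c
#include <stdint.h>

/* Returns 1 if a Thumb-2 BL at 'from' can branch directly to 'to'. */
int demo_bl_reachable(uint32_t from, uint32_t to)
{
    /* The branch offset is relative to the BL address + 4. */
    int64_t offset = (int64_t)to - ((int64_t)from + 4);

    return offset >= -16777216 && offset <= 16777214;
}
```

When the target is out of range, either the linker inserts a veneer, or long_call makes the compiler load the full address into a register and branch through that.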

You can try out various things in the online compiler. It makes for a good playground.
https://godbolt.org/z/sx6K75TGq
(added the goto for fun- the compiler makes no assumptions about the target address when using, obviously not useful when passing arguments, or when you need to return)

Quote
how would you expect those two structs to be zeroed?
Like any other local var. The compiler may put it on the stack and zero it out, or it may optimize the code where there is no need to use the stack or to zero it out. A local non-static var is not using bss in any case.
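A quick way to see this on a host: an automatic struct with = {0} is re-zeroed by generated code every time the declaration is reached, while a static one lives in .bss and is zeroed once by the startup code. A sketch with made-up names (`demo_time`, `demo_auto_counter`, `demo_static_counter`):

```c
struct demo_time { int hours; int minutes; int seconds; };

/* Automatic storage: zeroed by code in the function on every call. */
int demo_auto_counter(void)
{
    struct demo_time t = {0};
    t.seconds += 1;
    return t.seconds;          /* the same value on every call */
}

/* Static storage: placed in .bss, zeroed once before main() runs. */
int demo_static_counter(void)
{
    static struct demo_time t;
    t.seconds += 1;
    return t.seconds;          /* keeps counting across calls */
}
```

So a missed BSS clear could upset the static variant, but not sTime and sDate in the function quoted above.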
« Last Edit: March 15, 2023, 06:09:45 am by cv007 »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #49 on: March 15, 2023, 06:06:19 am »
Code: [Select]
sprintf(datetime, "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",
d_o_w[i],d_o_w[i+1],d_o_w[i+2],sDate.Date, sDate.Month, sDate.Year,
sTime.Hours, sTime.Minutes, sTime.Seconds, day_of_year(sDate.Date, sDate.Month-1, sDate.Year+100));

}

That code does not use %s in the format string. IMO, there is nothing in this code which is looking for a terminating zero. It should not "run forever". Initial state of datetime is irrelevant, as are all the input values here; %c or %d cannot "crash", there are no invalid values for them.

Something else is going on; maybe look with a debugger where the code is actually stuck. Maybe you are in a hardfault handler or something.
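On the snprintf() remark: a bounded variant costs little and turns an overrun into a detectable error. A sketch with the HAL structs replaced by plain integers; `demo_format_date_time`, the `demo_d_o_w` table and its contents are assumptions for illustration, not the original code:

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed layout: three letters per weekday, Monday first. */
static const char demo_d_o_w[] = "MONTUEWEDTHUFRISATSUN";

/* Formats into out[out_size]. Returns 0 on success, -1 if the weekday is
   out of range or the text did not fit. */
int demo_format_date_time(char *out, size_t out_size,
                          int weekday, int day, int month, int year2,
                          int hh, int mm, int ss, int doy)
{
    if (weekday < 1 || weekday > 7) {
        return -1;             /* would index outside the weekday table */
    }
    int i = (weekday - 1) * 3;
    int n = snprintf(out, out_size,
                     "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",
                     demo_d_o_w[i], demo_d_o_w[i + 1], demo_d_o_w[i + 2],
                     day, month, year2, hh, mm, ss, doy);

    /* snprintf returns the length it wanted to write; >= size means cut. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The weekday range check is there because (WeekDay - 1) * 3 with a zero or garbage WeekDay would read outside the table, which is a separate hazard from the output buffer size.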
 
The following users thanked this post: peter-h

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #50 on: March 15, 2023, 07:54:45 am »
The sprintf was looping - I was stepping through it. I will look more carefully when I get back to it.

This code has been running for years. What has changed is the linkfile, because I am now running this as a "user written overlay". I am suspecting uninitialised variables, and therefore one of these

Code: [Select]
/* This stuff is copied from FLASH to RAM by C code in the main stub */
.all_nonboot_data :
  {
    . = ALIGN(4);
    _s_nonboot_data = .;        /* create a global symbol at data start */
    *(.data .data*)      /* .data sections */
      . = ALIGN(4);
    _e_nonboot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_APP

  /* used by the main stub C code to initialize data */
  _si_nonboot_data = LOADADDR(.all_nonboot_data);

  /* Uninitialized data section for rest of xxx */
/* This stuff is zeroed by C code in the main stub */
  . = ALIGN(4);
  .all_nonboot_bss :
  {
      _s_nonboot_bss = .;          /* define a global symbol at bss start */
    *(.bss .bss* .COMMON .common .common*)
    . = ALIGN(4);
    _e_nonboot_bss = .;          /* define a global symbol at bss end */
  } >RAM
 

Hence I wondered what section an initialised struct (a local one, thus on the function stack) is supposed to be in, but the answer must be None. It cannot possibly be anything like that, because the function would not be thread-safe.

This code sets up data and bss

Code: [Select]

// Entry point for code called by the boot block - has to be located at the base of FLASH+32k
// This is done in linkfile
// Has to be "int" otherwise compiler complains :)
// Various measures to prevent it getting optimised away, which probably don't work. What does work
// is vector2 (which is in assembler and thus immune to optimisation) having a dword pointing to main().

__attribute__((optimize("O0")))
__attribute__((used))

int main()
{

// Initialise DATA

extern char _s_nonboot_data;
extern char _e_nonboot_data;
extern char _si_nonboot_data;
void * memcpy (void *__restrict, const void *__restrict, size_t);
memcpy(&_s_nonboot_data, &_si_nonboot_data, &_e_nonboot_data - &_s_nonboot_data);

// Zero BSS and COMMON

extern char _s_nonboot_bss;
extern char _e_nonboot_bss;
void * memset (void *, int, size_t);
memset(&_s_nonboot_bss, 0, &_e_nonboot_bss - &_s_nonboot_bss);

// Go to user program

main_real();

// We should never get here

for (;;);
}


// Vectab2.s ends up linked here, at base of FLASH + 32k + 512

Traditional ST-supplied code uses the famous startupxxx.s asm code to do the above, but that works only for the first part of the code (my boot block in this case). Any code loaded later (the above main()) has to be separately linked and has to set up its own sections.
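The pattern generalises to any separately linked image: one copy for the initialised data, one fill for the zero-initialised data, both driven by start/end/load-address symbols from that image's linker script. A host-side sketch with arrays standing in for flash and RAM (`demo_init_sections`, `demo_flash`, `demo_ram` and `demo_run_init` are made-up names):

```c
#include <stddef.h>
#include <string.h>

/* Copy (data_end - data_start) bytes from load to data_start, then zero
   [bss_start, bss_end). On the target the five pointers would be the
   addresses of linker symbols such as _s_nonboot_data. */
void demo_init_sections(char *data_start, char *data_end, const char *load,
                        char *bss_start, char *bss_end)
{
    memcpy(data_start, load, (size_t)(data_end - data_start));
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}

/* Stand-ins: "flash" holds the initial values, "ram" starts out dirty. */
static const char demo_flash[4] = { 1, 2, 3, 4 };
static char demo_ram[8] = { 9, 9, 9, 9, 9, 9, 9, 9 };

/* Runs the init on the stand-ins; returns a pointer to the "RAM" image. */
const char *demo_run_init(void)
{
    demo_init_sections(demo_ram, demo_ram + 4, demo_flash,
                       demo_ram + 4, demo_ram + 8);
    return demo_ram;
}
```

One caveat, assuming the usual layout: the code doing the copy, and the memcpy/memset it calls, must not themselves rely on the data or BSS that is being set up.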

Right now I have more fun running an STLINK V3 from one Cube on the PC and switching to another Cube in a VM :) The Cube in the VM was a later version (for testing; I always check a new Cube generates the same binary byte for byte) and it decided to upgrade the STLINK, and due to a dodgy USB connection it bricked it. So now I am looking for an unbricking utility...
« Last Edit: March 15, 2023, 06:25:10 pm by peter-h »
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 560
  • Country: sk
Re: A question on GCC .ld linker script syntax
« Reply #51 on: March 15, 2023, 09:06:27 am »
Quote
Hence I wondered what section an initialised struct (a local one, thus on the function stack) is supposed to be in, but the answer must be None. It cannot possibly be anything like that, because the function would not be thread-safe.
"sections" apply to variables allocated at compile/link time. Local non-static variables are placed to stack dynamically, at runtime. If they are explicitly initialized, the compiler generates the initialization code at the point of initialization (optimizer may decide to modify this in ways which are equivalent).
Quote from: C99, 6.2.4
4 An object whose identifier is declared with no linkage and without the storage-class
specifier static has automatic storage duration.
5 For such an object that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way. (Entering an enclosed block or calling a function suspends, but does not end,
execution of the current block.) If the block is entered recursively, a new instance of the
object is created each time. The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration is
reached in the execution of the block; otherwise, the value becomes indeterminate each
time the declaration is reached.
I don't know and I don't want to know what's "thread safe".

JW
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #52 on: March 15, 2023, 05:16:29 pm »
Quote
I don't know and I don't want to know what's "thread safe".

Re-entrant code. Everything needs to either be on the stack, or, as was done in the old days of multiuser operating systems (and 24k of core on an ICL1904), the caller supplies an index and the called function keeps its variables offset by that index.

My "overlay" code is running but weird stuff is happening, suggesting uninitialised variables or some such.

The sprintf thing was a red herring. The format string is there and is null-terminated btw

Code: [Select]
sprintf(datetime, "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",

Does anyone know if it is even possible to use an STLINK debugger to load code linked to run at 0x8008000 (instead of 0x8000000) so that it programs the right cpu flash addresses? IOW, it does not try to program anything into 0x8000000-0x8007fff.
« Last Edit: March 15, 2023, 05:47:59 pm by peter-h »
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 560
  • Country: sk
Re: A question on GCC .ld linker script syntax
« Reply #53 on: March 15, 2023, 05:59:13 pm »
Quote
Does anyone know if it is even possible to use an STLINK debugger to load code linked to run at 0x8008000 (instead of 0x8000000) so that it programs the right cpu flash addresses? IOW, it does not try to program anything into 0x8000000-0x8007fff.
I use it in that way every day.

Not from CubeIDE, though - bare gdb+OpenOCD. But I don't see why that wouldn't work in CubeIDE either.

Maybe you could try to use simpler tools - STLink Utility or CubeProgrammer - just to see if it makes any difference.

[EDIT] CubeIDE may work out of the .elf, so you may want to have a look at what exactly is in the .elf you are generating (perhaps using objdump).

JW
« Last Edit: March 15, 2023, 06:03:28 pm by wek »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #54 on: March 15, 2023, 06:23:29 pm »
Cube does use the ELF file, apparently.

But this problem is simpler:

Code: [Select]

void xxx_get_date_time(char *datetime)
{

RTC_TimeTypeDef sTime = {0};
RTC_DateTypeDef sDate = {0};

HAL_RTC_GetTime(&hrtc, &sTime, RTC_FORMAT_BIN);
HAL_RTC_GetDate(&hrtc, &sDate, RTC_FORMAT_BIN);

int i = (sDate.WeekDay - 1) * 3;

sprintf(datetime, "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",
d_o_w[i],d_o_w[i+1],d_o_w[i+2],sDate.Date, sDate.Month, sDate.Year,
sTime.Hours, sTime.Minutes, sTime.Seconds, day_of_year(sDate.Date, sDate.Month-1, sDate.Year+100));

}

In the above, the sprintf() is being passed a load of uninitialised data (0xffffff....) for the format string.

Code: [Select]
080582ee:   str     r6, [sp, #12]
080582f0:   str     r5, [sp, #8]
080582f2:   str     r4, [sp, #4]
080582f4:   str.w   r8, [sp]
080582f8:   ldr     r3, [sp, #32]
080582fa:   mov     r2, r7
080582fc:   ldr     r1, [pc, #20]   ; (0x8058314 <xxx_get_date_time+168>)
080582fe:   ldr     r0, [sp, #36]   ; 0x24
08058300:   bl      0x8048a50 <sprintf_>

r1=0x8071870 (the format string address, fairly obviously) and contains a load of FFs
r0=0x2001ff20 and this is the right address on the stack for the sprintf output buffer


Looking in the .map file around 0x8071870 I see

Code: [Select]
.rodata.str1.4
                0x000000000807157c      0x2c8 ../LIBxxx\libxxx.a(xxx_NTP.o)
                                        0x2f0 (size before relaxing)
 .rodata.str1.4
                0x0000000008071844       0x58 ../LIBxxx\libxxx.a(xxx_rtc.o)
 .rodata.str1.4
                0x000000000807189c       0x1a ../LIBxxx\libxxx.a(xxx_sensors.o)
 *fill*         0x00000000080718b6        0x2

so 8071870 is in the right module (RTC) but hey this stuff is coming out of a .a library! I wonder if that library is simply missing all DATA sections, and contains only TEXT?

The same code runs if rtc.o is from a .o file, but not if rtc.o is from a .a library < bang head again >

The library is made with

arm-none-eabi-ar rvs libxxx.a @libxxx_objs.txt

I wonder if "rvs" does just TEXT? I want to build that lib with everything in the .o files (listed in libxxx_objs.txt).

A lot on google but not relating to the AR library creator tool. Its options don't appear to include one which might strip stuff out of .o files. Or maybe the linker is stripping it out?

This is crazy. I could use .o files instead of the stupid library. But in Cube there is no evident linker option to fetch .o files from a path. This is available only for a library. Perhaps I could merge all .o files into one big .o file.

The string in question is

Code: [Select]
sprintf(datetime, "%c%c%c %02d%02d20%02d %02d%02d%02d DOY=%03d",
and the other possibility is that the linkfile is not pulling this string (which I presume is DATA) out of a library, but it obviously does pull it out of a .o file.

A hex dump shows the format string is in the library:



« Last Edit: March 15, 2023, 09:31:00 pm by peter-h »
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #55 on: March 15, 2023, 09:44:04 pm »
.rodata* is read-only data, as the name implies. Also, the placement is correct, inside the flash address space. Hence, nothing related to DATA or .data.

Next, place a breakpoint at the sprintf in question, and when it is hit, inspect the registers. Or step into sprintf in disassembly mode, and inspect the registers. Anything suspicious? Where does r1 point?

Next, inspect the memory at the address in r1. Not the map file, not the ELF, not the binary or any other file, but the real memory. What is there? Do you see your string?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #56 on: March 15, 2023, 09:59:59 pm »
I need to find the section name in which that sprintf format string is being placed in the .a library.

I can't find a tool which can read a .a lib file and list all the stuff in detail.

If that format string was rodata, it should be collected here

Code: [Select]
 
/* This collects all other stuff, which gets loaded into FLASH after xxx_main.o above */
  .code_constants_etc :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _e_code_constants_etc = .;        /* define a global symbol at end of code */
  } >FLASH_APP

I did inspect the contents of the addresses in r0 and r1. The format string is all 0xFF. So it isn't being extracted out of the library (but is extracted from a .o).

I need to dump the library.

Somebody posted somewhere that 7-zip can be used to dump a .a library. I tried that - it is BS.

But it gets better. That format string is found in the final binary file too. Yet the wrong address is being passed to sprintf.
« Last Edit: March 15, 2023, 10:57:40 pm by peter-h »
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #57 on: March 15, 2023, 11:59:42 pm »
 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: be
Re: A question on GCC .ld linker script syntax
« Reply #58 on: March 16, 2023, 12:05:57 am »
It should come with the toolchain, mine are named arm-none-eabi-nm (I've got several versions of them installed)
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #59 on: March 16, 2023, 08:50:38 am »
Yeah... much better now. The main problem was that the rodata stuff was at the very end of the overlay, and the last block (32k) of the overlay was not being programmed into the cpu flash because I was counting one block too few in the loader programming code ;)

This took a bit of sorting out because even after an initial fix the loader code was still not programming the last few bytes, which happened to be DATA or RODATA or some such.
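
For reference, the usual guard against this kind of off-by-one is to round the block count up rather than down. A minimal sketch in shell arithmetic; the function name and the sizes are made up for illustration and are not taken from the actual loader:

```shell
# Hypothetical helper: how many flash blocks an image of a given size occupies.
blocks_needed() {
    local size_bytes=$1 block_bytes=$2
    # Ceiling division: a partly filled last block still has to be programmed.
    echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
}

blocks_needed 98304 32768    # image is an exact multiple of the 32k block size
blocks_needed 98305 32768    # one byte more spills into an extra block
```

Plain truncating division (size / block) gives the same count for both cases, which is how the tail of an image ends up unprogrammed.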
« Last Edit: March 16, 2023, 04:14:07 pm by peter-h »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #60 on: March 18, 2023, 12:28:19 pm »
This is in case somebody comes across this in the future.

Using libraries with an ST HAL project is a complete waste of f*****g time because, as wek pointed out, a weak symbol in a .o cannot be overridden by a non-weak one in a .a.

Some of this is easy to take care of but the ST code is full of "software purist" kind of total crap like this

Code: [Select]

/**
  * @brief  UART MSP Init.
  * @param  huart pointer to a UART_HandleTypeDef structure that contains
  *                the configuration information for the specified UART module.
  * @retval None
  */
 __weak void HAL_UART_MspInit(UART_HandleTypeDef *huart)
{
   /* Prevent unused argument(s) compilation warning */
  UNUSED(huart);
  /* NOTE: This function Should not be modified, when the callback is needed,
           the HAL_UART_MspInit could be implemented in the user file
   */
}

so I wasted a day finding out why the UARTs were not working in a project using a xxx.a library. It was only when stepping through their register configuration that I found this.

Why do people write this shit? You spend so much time typing up empty functions with __weak on the front.

The above weak function does exactly nothing. The project will compile but will obviously not run.

One approach is to not use libraries at all. That implies that all .c sources have to be present, which is in some variations of my project undesired, but just having the .o files is no good because the Cube build scripts do a del *.* in the project build directory. I suppose one could have the .o files (the ones without source) with a R/O attribute ;)

Another one is to go through all __weak declarations and comment them out (135 of them in the ST code I have!). That is what I am doing. Well, just the ones affecting my project.

 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 560
  • Country: sk
Re: A question on GCC .ld linker script syntax
« Reply #61 on: March 18, 2023, 12:39:19 pm »
> Another one is to go through all __weak declarations and comment them out (135 of them in the ST code I have!). That is what I am doing. Well, just the ones affecting my project.

Do you intend to supply your version of CubeF4 (= the "library"), and resolve conflicts stemming from users wishing to use another version of that "library"?

I did some googling and the answer here appears to have a solution. I of course don't know what the exact ramifications of using that switch are, especially in your particular setting.
Quote
I think it should be noted that the --whole-archive option also adds all the symbols within the given libraries, which will solve this problem but will increase the size of the executable and may produce additional linking errors.

JW
« Last Edit: March 18, 2023, 12:41:18 pm by wek »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #62 on: March 18, 2023, 12:56:45 pm »
Quote
Some of this is easy to take care of but the ST code is full of "software purist" kind of total crap like this
...
Why do people write this shit? You spend so much time typing up empty functions with __weak on the front.

People write confusing bullshit boilerplate so that
A) they can be replaced by ChatGPT
B) those who got replaced can complain how ChatGPT took their jobs
C) those who replaced them can boast how excellent ChatGPT is as it can write human-level code.

Expect things to get worse in the future. Much worse. Amount of human time spent in counterproductive work was at least physically limited. Now counterproductive programmers can get much more "productive".

Quote
One approach is to not use libraries at all.

Excellent approach. Highly recommended. Libraries are always a pain; only use them when they solve non-trivial problems, so that the work spent understanding the library and getting it to work is paid back as saved time. For example, if you need TLS, use a library. If you need UART on MCU, do not.

EDIT: almost forgot. I'm brilliant and you're stupid  :)
« Last Edit: March 18, 2023, 12:59:29 pm by Siwastaja »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #63 on: March 18, 2023, 01:20:56 pm »
Quote
I did some googling and the answer here appears to have a solution

Yes I saw that earlier and didn't want to go down another rabbit hole :)

Quote
Do you intend to supply your version of CubeF4 (= the "library"), and resolve conflicts stemming from users wishing to use another version of that "library"?

My "library" is just code which I (well mostly I) have written.

The other sources for this project are mostly the ST ports of LWIP, FreeRTOS, etc. These do all work and have been well tested. All are in source form. It is actually quite old stuff e.g. LWIP goes back > 15 years. So there is no urgent operational need to update any of these. I mean, if your PC runs fine, are you going to update the BIOS?

If somebody wants to update any of these, they will be able to if they know their way around the directory structure and the port code (the glue code used to port say FreeRTOS to the 32F417). With me having hacked out various __weak functions, the new code will not just drop in. But it never just drops in anyway... whenever one is updating some chunk of code v1.3 with v1.4 one has to do a file compare and check that the stuff being changed isn't going to break something. For example I have MbedTLS v2.something which was modded to read a whole certificate chain from a file, so updating TLS to v3 will be a few days' work. But there is nothing wrong with v2 unless you are one of the "deprecated crypto BS experts". This is IOT; it is not a server for microsoft.com :)

If the new module is object-only then you are out of luck if it contains __weak functions which don't work. Well, the ST supplied libc.a had loads of weaks which were empty stubs... I had to unweaken the whole libc.a using objcopy and then they could be replaced. So there is a way forward.

Truth is... you can rarely just update a load of sourcecode by dropping in a new version. From what I have seen, it is always a ton of work. Most open source stuff has loads of issues, starting with zero support, so you spend days googling to find out what somebody else did in 2009.

Another way forward would be to use just .o files (no lib) and edit the Cube pre-build script which does the del *.* but that will break when a new version of Cube is installed. I looked at this briefly but it's another rabbit hole.

Quote
For example, if you need TLS, use a library. If you need UART on MCU, do not.

Actually TLS is supplied in source form :) I would have nothing to do with a non-source "library" unless the interface is well defined and "simple". MbedTLS is probably not a good example due to the complex interface and functionality.

What I now need is an editor macro which puts #if 0 and #endif around any function which is __weak :)

The worst part of this __weak template business is that there is easy potential for code to break in subtle ways. All you need to do is move a function from a .c/.o to a .a and it will break silently.

Quote
People write confusing bullshit boilerplate so that
A) they can be replaced by ChatGPT
B) those who got replaced can complain how ChatGPT took their jobs
C) those who replaced them can boast how excellent ChatGPT is as it can write human-level code.

I thought there was a real reason for __weak but I can't see one, apart from being able to build a project, which doesn't have to actually work.

Is there a tool which one can run on a whole project and which can report which functions are __weak but are actually being called by real code?
« Last Edit: March 18, 2023, 05:55:08 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #64 on: March 18, 2023, 11:16:02 pm »
> Is there a tool which one can run on a whole project and which can report which functions are __weak but are actually being called by real code?

nm shows symbol types inside an ELF, but it can't tell if the function really gets called AFAIK.
nm my_project.elf | grep -e " W " -e " w " -e " v " -e " V " should dump all weak symbols.
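
A slightly stricter variant of the grep above is to match on the type column only, so that nothing else on the line can produce a hit. This is a sketch, assuming the default nm output format where the type letter is the second-to-last field (W/w for weak symbols, V/v for weak objects); weak_symbols is a made-up name and the sample lines are illustrative, not from a real build:

```shell
# Print the names of weak symbols from nm output read on stdin.
# Undefined weak symbols have no address column, so the type letter
# is located by counting fields from the end of the line.
weak_symbols() {
    awk 'NF >= 2 && $(NF-1) ~ /^[WwVv]$/ { print $NF }'
}

# Illustrative input only. In practice:
#   arm-none-eabi-nm my_project.elf | weak_symbols
printf '%s\n' \
    '08000100 T main' \
    '08000200 W HAL_UART_MspInit' \
    '         w __deregister_frame_info' \
    '20000000 D hrtc' | weak_symbols
```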
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #65 on: March 19, 2023, 06:45:40 am »
There are tools for checking dodgy C syntax so I am amazed there aren't tools for checking if empty functions (optionally but not always __weak) are getting called.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #66 on: March 19, 2023, 09:46:00 am »
Quote
Is there a tool which one can run on a whole project and which can report which functions are __weak but are actually being called by real code?

For 'are actually being called', no.  For 'may be called by real code', sure.

The difference is this:
    if (prng() == 42) foo();
Is foo() called or not?

If you want to know whether foo() really gets called, you have in your hands a variant of the classic halting problem.  The only way to know is to instrument foo() and run it.  For some specific cases, it is possible to instrument the internal state (variables, function result values) that affect whether the function gets called, and prove the conditions when the function gets called and when not.

However, if we want to know if the object file defines functions that are not called and whose addresses are never taken (noting that some function attributes like __attribute__((constructor)) and __attribute__((destructor)) actually cause the compiler to take the address of the function; it does not need to be explicitly taken), sure!  ld itself does exactly that when you compile stuff with -ffunction-sections -Wl,--gc-sections, to discard unneeded functions.

The procedure differs a bit when one uses dynamic linking (for stuff running under a proper OS with virtual memory), so I'll specifically limit to statically linked stuff, since microcontroller stuff is almost always statically linked.

You just generate a list of all function type symbols in the object files the project generates, the intermediate ELF .o files.  We then go through the list, and check that each defined function symbol is referenced, or it is unused; and that each referenced function symbol is defined at least once, or it is used but not implemented.  Simples!

There are two ways you can do this.  The simple way is to parse objdump -t output.  The other way is to parse the ELF files yourself.  In the latter case, you actually only need to find the symbol and string tables, since they contain all the information needed.  (All string-form data is used by reference, so it is a bit annoying; but the format is utterly stable.  On Linux, you can just use the already provided <elf.h>.)

In the objdump -t output, we are interested only in the SYMBOL TABLE, the lines that begin with a hexadecimal digit.  On elf32-littlearm (Armv7e-m Cortex-M7), this has five fields per line, with the second field being fixed-width, containing flags.  F denotes functions, and w denotes weak symbols; l (lowercase letter L) denotes local symbols, and g denotes global symbols.  We only need the second, third, and fifth fields.
When the second field has F and either g or w in it, it defines a function symbol (a weak one if there is a w).  We need these in one list.
When the second field is all spaces and the third field is *UND* (exact string, not a pattern), it means that something in that object file references that symbol.  We want these in a separate list.
Using your favourite scripting language (even Bash works fine for this if you use export LANG=C LC_ALL=C to explicitly set the default C locale), extract the two cases into separate lists or dictionaries.  Then, it is just a matter of looking up which functions aren't referenced.
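
The classification described above can be sketched in a few lines of awk. This is only a rough illustration, assuming GNU objdump -t output for ELF (address, a 7-character flags field, section, a TAB, size, name); classify_symbols is a made-up name and the sample lines are illustrative. As far as I can tell, weak definitions show w without g in the flags field, so the sketch accepts either:

```shell
# Classify lines of objdump -t output read on stdin.
# Prints "ref", "weak" or "def", followed by the symbol name.
classify_symbols() {
    awk '
    /^[0-9A-Fa-f]+ / {
        addr_len = index($0, " ") - 1         # width of the address column
        flags = substr($0, addr_len + 2, 7)   # fixed-width flags field
        rest  = substr($0, addr_len + 10)     # section, TAB, size, name
        if (split(rest, part, "\t") < 2) next
        n = split(part[2], tok, " ")
        name = tok[n]                         # last token is the symbol name
        if (part[1] == "*UND*")                print "ref", name
        else if (flags ~ /F/ && flags ~ /w/)   print "weak", name
        else if (flags ~ /F/ && flags ~ /g/)   print "def", name
    }'
}

# Illustrative input only. In practice:
#   arm-none-eabi-objdump -t file.o | classify_symbols
printf '%s\t%s\n' \
    '00000000 g     F .text' '0000000c main' \
    '00000000  w    F .text' '00000002 HAL_UART_MspInit' \
    '00000000         *UND*' '00000000 sprintf' | classify_symbols
```

Note that a function called only from within its own object file never shows up as *UND*, so "defined but never referenced" in this listing only means "not referenced from another object file".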

In C, a hash table on the function symbol name, with the data specifying whether it has been defined (and optionally in which object file or files and with which attributes), and whether it has been referenced (and optionally in which object file or files), should work very well and be extremely fast (bound by I/O bandwidth and latencies).

If you are asking for a tool that you can just install and run, I can't help you with those.  They may or may not exist; I do not know.
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 304
  • Country: ua
Re: A question on GCC .ld linker script syntax
« Reply #67 on: March 19, 2023, 11:44:38 am »
Quote
The other sources for this project are mostly the ST ports of LWIP, FreeRTOS, etc. These do all work and have been well tested.

Out of curiosity: when you say "well tested", what does that mean?
Do you test manually? Or is it an automatic unit test? Do you use real hardware or a jig to test? What is your test procedure?
Open source embedded network library https://mongoose.ws
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #68 on: March 19, 2023, 07:42:27 pm »
1) The stuff has been around a very long time, and been in fairly active development for years of that time.
2) The product has been in development for years and a number of boards have been running 24/7, running a fully loaded product.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #69 on: March 20, 2023, 10:37:13 am »
Quote
The difference is this:
    if (prng() == 42) foo();
Is foo() called or not?

What I had in mind is purely static analysis.

For example there is clearly a "decision point" during the link process where a __weak function is not overridden by a symbol which originates from a library, whereas it would be overridden by a symbol which originates from a .o file.

So the linker could in principle generate such a report.

Incidentally in my project there should not be any scenario where both a ex .o and a ex .a symbol are presented against a __weak function. But that scenario could also be detected by the linker.

As I say this is purely static analysis.

The danger I am trying to prevent is that in a project you may have 10k functions, of which 500 are __weak, and of the 500 there are 10 which are not defined by any real code, and this is not discovered because those 10 lie in some code which happens to not get tested. It's pretty nasty.

ST have generated __weak functions all over the place. I have now commented out everything I could find that I know is actually or potentially used in my project. But that's stupid.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #70 on: March 20, 2023, 12:05:01 pm »
Quote
> The difference is this:
>     if (prng() == 42) foo();
> Is foo() called or not?
What I had in mind is purely static analysis.

Ok, so basically whether there is a reference to a function or not; and not whether that reference actually gets used at some point.

It'd be no problem to write a script to do this.  What's your development OS?  Do you have Bash available?  Or would you prefer Awk?  (Any of Bash, Mawk, or Gawk would make this quick'n'easy.)

I've got a nasty cold right now, so I cannot concentrate too well, and thus cannot promise I can write it right now as a standalone ANSI C program (+specific-width int types from stdint.h) without any external dependencies.  It is, however, highly unlikely that the nm/objdump output format would change for Cortex-M7, so a script implementation should be perfectly fine for your needs.

Quote
For example there is clearly a "decision point" during the link process where a __weak function is not overridden by a symbol which originates from a library, whereas it would be overridden by a symbol which originates from a .o file.

Quite.  As wek mentioned in #25, symbol lookup in an archive differs from symbol lookup in an object file, by definition.   (That is, when an object file contains a weak symbol, other object files are checked for a non-weak version of that symbol, but archives are not.  It is a design decision, based on the use cases that weak symbols were invented to allow.)

When that is undesirable, you can tell the linker to treat the archive as a collection of object files instead (see --whole-archive in man 1 ld).  So, at compile/link time, instead of -lfoo to link in libfoo.a, you use -Wl,--push-state,--whole-archive -lfoo -Wl,--pop-state (typically in LDFLAGS).

Quote
So the linker could in principle generate such a report.

Certainly, yes.

It is easy to do it as a separate pass after all object files have been generated, but not linked into the final binary, too.

At that point, you can even use objcopy to move symbols/objects between object files, rename symbols, and delete symbols (for example, if you determine the archive/library version is superior to the one in any of the object files, and so on); but I'd restrict it to deleting unneeded symbols if possible, or even better, use the whole-archive option above.
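
As a rough sketch of such a separate pass, assuming GNU nm -A output (file name, or archive:member, joined to the address with a colon; W for a weak definition, T for a strong one in a code section), the following lists every weak function definition together with where a strong definition exists, if anywhere. The function name, the status labels, and the sample file names are all made up for illustration, and paths containing spaces are not handled:

```shell
# Read nm -A output on stdin and report, for each weak definition,
# whether a strong definition exists in an object file, only in an
# archive, or nowhere at all.
weak_override_report() {
    awk '
    NF >= 3 {
        type = $(NF-1); name = $NF
        from_archive = ($1 ~ /\.a:/)      # "lib.a:member.o:addr" prefix
        if (type == "W") weak[name] = 1
        else if (type == "T") {
            if (from_archive) in_lib[name] = 1
            else in_obj[name] = 1
        }
    }
    END {
        for (name in weak) {
            if (name in in_obj)      status = "overridden-by-object"
            else if (name in in_lib) status = "strong-only-in-archive"
            else                     status = "weak-only"
            print status, name
        }
    }' | LC_ALL=C sort
}

# Illustrative input only. In practice something like:
#   arm-none-eabi-nm -A *.o libxxx.a | weak_override_report
printf '%s\n' \
    'stm32f4xx_hal_uart.o:00000000 W HAL_UART_MspInit' \
    'stm32f4xx_hal_uart.o:00000004 W HAL_UART_TxCpltCallback' \
    'stm32f4xx_hal_rtc.o:00000000 W HAL_RTC_MspInit' \
    'libxxx.a:xxx_uart.o:00000000 T HAL_UART_MspInit' \
    'main.o:00000000 T HAL_RTC_MspInit' \
    'main.o:00000010 T main' | weak_override_report
```

Anything reported as strong-only-in-archive is the case discussed above: the real implementation exists, but the linker has no reason to pull that archive member in just to replace a weak definition it already has.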

Quote
ST have generated __weak functions all over the place.

It makes sense, if that code is to be compiled and linked into a library.  Then, users can easily override any such functionality by defining non-weak versions of those functions in their object code.

Your problem is that you put the user code in a library, and the ST code in a bunch of object files.  Basically the inverse of what ST generates the code for.  It is not an arbitrary choice one can make.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #71 on: March 20, 2023, 02:02:41 pm »
Win7-64. However I have cygwin and so have bash and gawk, but not awk. 267 executables ...

I have a dir called awk but it contains grcat and pwcat.

AFAICT the ST code was never intended for a library. It is in source form and is meant to be loaded into Cube IDE in that form. I suppose somebody might have then made a .a lib out of it but I haven't come across that. Can't see the point of a lib if you have the sources, ever.

They supply newlib printf etc as a lib and w/o sources and that lib is not weak ;) As previously posted, I had to weaken it to replace the printf code.


« Last Edit: March 20, 2023, 02:04:59 pm by peter-h »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4615
  • Country: gb
Re: A question on GCC .ld linker script syntax
« Reply #72 on: March 20, 2023, 03:05:49 pm »
Quote
1) The stuff has been around a very long time, and been in fairly active development for years of that time.
2) The product has been in development for years and a number of boards have been running 24/7, running a fully loaded product.

that's - ironically - weak(1) safety  :D

edit:
(1) necessary, but not sufficient
« Last Edit: March 20, 2023, 03:12:54 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #73 on: March 20, 2023, 03:15:38 pm »
Yes of course I agree that I should spend next 100 years of my life writing it all myself :)

A lot of corporate users buy in something commercial and get a nice warm feeling that way.

I have no illusions that a lot of open source stuff is crap. Probably 90%. But a lot of it is good, simply because of a large user base. The commercial products are often written by one guy anyway; the companies correspond as "we" but it is just one person doing it all :) The user base will only be according to how many they have sold, but there is plenty of competition in the non-free sector.

It's easy to write "weak safety" but do you want to work for me for a few years for free? Too many one-liners.

Fortunately, embedded products can have a relatively narrow application sphere, so can be adequately tested. With a PC etc you could not ever do that.

Writing a test harness for a TCP/IP stack is a huge job. By the time you have done one which explores all the areas where skeletons are buried, you can write it all yourself.
« Last Edit: March 20, 2023, 05:09:11 pm by peter-h »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4615
  • Country: gb
Re: A question on GCC .ld linker script syntax
« Reply #74 on: March 20, 2023, 07:55:27 pm »
Quote
It's easy to write "weak safety" but do you want to work for me for a few years for free? Too many one-liners.

I meant, there are specific skills and specific tools in software testing. You learn the basics from courses like "software engineering" (engineering university courses, second, third and fourth year), then refine and master them later by learning from seniors and by training on things like DO-178B.


 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #75 on: March 22, 2023, 08:37:28 am »
Back to the topic, I think I managed to catch all relevant __weak functions which are used but not being replaced with real code, but I am not 100% sure.

They are easy enough to find with a Search, without even using objcopy etc to list them. If you have say 100 of them then it may take you some hours to see if any don't tie up, and put
#if 0
#endif
around any suspicious ones, and this is safe because the linker will complain if you went too far.
« Last Edit: March 22, 2023, 10:38:49 am by peter-h »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #76 on: March 22, 2023, 11:54:02 am »
Quote
Win7-64. However I have cygwin and so have bash and gawk, but not awk. 267 executables ...
I have a dir called awk but it contains grcat and pwcat.

Okay.  I don't use Windows, so the following script might need massaging wrt. path separator stuff.

Quote
AFAICT the ST code was never intended for a library. It is in source form and is meant to be loaded into Cube IDE in that form. I suppose somebody might have then made a .a lib out of it but I haven't come across that. Can't see the point of a lib if you have the sources, ever.
They supply newlib printf etc as a lib and w/o sources and that lib is not weak ;) As previously posted, I had to weaken it to replace the printf code.

Heh, so it is cargo cult programming on ST's part, then.  Not a big surprise, though; Enterprise-grade code is rarely of high (or even medium) quality.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #77 on: March 22, 2023, 02:04:02 pm »
Here is the script I cobbled together.  It is not intended to be minimal or the fastest possible; I wanted it to be as easily modified and adapted as possible.
Save it somewhere as say checksym.sh or something, make it executable, and specify the object and archive names on the command line.
Code: [Select]
#!/bin/bash
# SPDX-License-Identifier: CC0-1.0
# -*- coding: utf-8 -*-

OBJDUMP="${OBJDUMP:-arm-none-eabi-objdump}"
AWK="${AWK:-gawk}"
SED="${SED:-sed}"
SORT="${SORT:-sort}"
XARGS="${XARGS:-xargs}"
TOUCH="${TOUCH:-touch}"

# Use default locale, since we're parsing command outputs
export LANG=C LC_ALL=C

# Create an auto-removed temporary directory for our temporary files
Work="$(mktemp -d)" || exit 1
trap "rm -rf '$Work'" EXIT

# We list all object files in "$Work/objfiles",
# and all libraries/archives in "$Work/libfiles",
# one name or path per line.
printf '%s\n' "$@" | "$SED" -ne '/\.o$/p' > "$Work/objfiles"
printf '%s\n' "$@" | "$SED" -ne '/\.a$/p' > "$Work/libfiles"
# Note that you can replace this with e.g.
#   find . -name '*.o' -printf '%p\n' > "$Work/objfiles"
#   find . -name '*.a' -printf '%p\n' > "$Work/libfiles"
# Feel free to replace the above with whatever mechanism you like.

# To ensure we only look at each library file once, we sort the list.
# The sed is magic: it changes to tab separators, with symbol table
# having five entries (addr,flags,section,size,name), and file names
# two entries (object,archive):
"$SORT" -u "$Work/libfiles" | "$XARGS" -r -d '\n' "${OBJDUMP}" -t \
 | "$SED" -e '/^$/d ; /^SYMBOL TABLE:/d ; s|^In archive \(.*\): *$|\t\1|; s|: \+file format .*$|\t|; s| \([^ \t]\+\t[0-9A-Fa-f]\+\) |\t\1\t| ; s|^\([0-9A-Fa-f]\+\) |\1\t|' \
 > "$Work/symbols"
#
# Object files are handled in a very similar manner, with the
# only exception being that there are no archive file names, and object file names are single-field records.
"$SORT" -u "$Work/objfiles" | "$XARGS" -r -d '\n' "${OBJDUMP}" -t \
 | "$SED" -e '/^$/d ; /^SYMBOL TABLE:/d ; s|: \+file format .*$||; s| \([^ \t]\+\t[0-9A-Fa-f]\+\) |\t\1\t| ; s|^\([0-9A-Fa-f]\+\) |\1\t|' \
 >> "$Work/symbols"

#
# We use awk to process the combined symbol list.  There are four types of lines/records, with TAB separators:
#   object-file-name                        Names an object file as the source for the following symbols
#   object-file-name<TAB>                   Names an object file within the current archive/library for the following symbols
#   <TAB>archive-file-name                  Names the archive/library for the following symbols
#   hex<TAB>flags<TAB>section<TAB>hex<TAB>symbol    Names a symbol.  Flags are per objdump -t. References name *UND* as their section.
"$AWK" 'BEGIN {
            FS = "\t";
            split("", funs)
            split("", objs)
            split("", refs)
            aname = ""  # Archive file name
            oname = ""  # Object file name within an archive
            fname = ""  # File name (or combined aname ":" oname)
        }

        NF==1 { # Solitary object file name
            aname = ""
            oname = ""
            fname = $1
        }

        NF==2 { # Archive file
            if (length($2) > 0) {
                aname = $2
                oname = ""
                fname = aname
            } else
            if (length($1) > 0) {
                oname = $1
                fname = aname ":" oname
            }
        }

        NF==5 { # Symbol table record

            # Only consider symbols that start with _ or a letter
            if (!($5 ~ /^[_A-Za-z]/)) next;

            # Skip local, debug, dynamic, indirect, file, and warning symbols
            if ($2 ~ /[lWIiDdf]/) next;

            # Skip common and absolute-address stuff
            if ($3 == "*COM*" || $3 == "*ABS*") next;

            # If the symbol or reference is weak, we prefix the symbol name with !.
            if ($2 ~ /w/) {
                weak = 1
                sym = "!" $5
            } else {
                weak = 0
                sym = $5
            }

            if ($3 == "*UND*") {
                # Symbol reference. Add file name to refs under this symbol.
                if (sym in refs)
                    refs[sym] = refs[sym] "\t" fname
                else
                    refs[sym] = fname
            } else {
                if ($2 ~ /F/) {
                    # Function definition
                    if (sym in funs)
                        funs[sym] = funs[sym] "\t" fname
                    else
                        funs[sym] = fname
                } else {
                    # Non-function definition
                    if (sym in objs)
                        objs[sym] = objs[sym] "\t" fname
                    else
                        objs[sym] = fname
                }
            }
        }

        END {

            # Find strong function definitions defined in more than one file
            split("", syms)
            for (sym in funs)
                if (!(sym ~ /^!/) && (funs[sym] ~ /\t/))
                    syms[sym] = funs[sym]
            if (length(syms) > 0) {
                printf "%d duplicate (non-weak) function definitions:\n", length(syms)
                for (sym in syms)
                    printf "  %s in %s\n", sym, syms[sym]
                printf "\n"
            } else {
                printf "There are no duplicate (non-weak) function definitions.\n\n"
            }

            # Find weak function definitions without corresponding strong symbol definitions
            split("", syms)
            for (wsym in funs) if (wsym ~ /^!/) {
                sym = substr(wsym, 2)
                if (!(sym in funs))
                    syms[sym] = funs[wsym]
            }
            if (length(syms) > 0) {
                printf "%d weak functions without strong function definitions:\n", length(syms)
                for (sym in syms)
                    printf "  %s defined in %s\n", sym, syms[sym]
                printf "\n"
            } else {
                printf "All weak function definitions have corresponding strong function definitions.\n\n"
            }

            # Find strong function symbols that are never referenced
            split("", syms)
            for (sym in funs) if (!(sym ~ /^!/)) {
                wsym = "!" sym
                if (!(sym in refs) && !(wsym in refs))
                    syms[sym] = funs[sym]
            }
            if (length(syms) > 0) {
                printf "%d (non-weak) functions that are never referenced:\n", length(syms)
                for (sym in syms)
                    printf "  %s defined in %s\n", sym, syms[sym]
                printf "\n"
            } else {
                printf "All (non-weak) functions are referenced at least once.\n\n"
            }

            # Find weak function symbols that are never referenced
            split("", syms)
            for (wsym in funs) if (wsym ~ /^!/) {
                sym = substr(wsym, 2)
                if (!(sym in refs) && !(wsym in refs))
                    syms[sym] = funs[wsym]
            }
            if (length(syms) > 0) {
                printf "%d weak functions that are never referenced:\n", length(syms)
                for (sym in syms)
                    printf "  %s defined in %s\n", sym, syms[sym]
                printf "\n"
            } else {
                printf "All weak functions are referenced at least once.\n\n"
            }

            # Find references that cannot be resolved
            split("", syms)
            for (ref in refs) {
                if (ref ~ /^!/) {
                    sym = substr(ref, 2)
                    wsym = ref
                } else {
                    sym = ref
                    wsym = "!" ref
                }
                if (!(sym in funs) && !(wsym in funs) && !(sym in objs) && !(wsym in objs)) {
                    if (sym in syms)
                        syms[sym] = syms[sym] "\t" refs[ref]
                    else
                        syms[sym] = refs[ref]
                }
            }
            if (length(syms) > 0) {
                printf "%d unresolved symbols:\n", length(syms)
                for (sym in syms)
                    printf "  %s referenced in %s\n", sym, syms[sym]
                printf "\n"
            } else {
                printf "No unresolved symbols.\n\n"
            }

        }' "$Work/symbols"
This has only been tested on Linux.

I used OBJDUMP, SED, etc. for the corresponding executables' pathnames.  The Bash substitution VAR="${VAR:-default}" uses the already set non-empty value, or default if none set or empty.  So, one can use e.g. bash -c 'AWK=some-other-awk-variant checksym.sh' to override the script-set value.
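
To see the ${VAR:-default} behaviour in isolation, here is a minimal sketch.  The function name pick_awk is made up for this illustration and is not part of the script above.

```shell
#!/bin/bash
# pick_awk is a hypothetical helper used only for this illustration.
# It prints the awk command the script would end up using.
pick_awk() {
    local chosen="${AWK:-gawk}"   # inherited non-empty AWK wins, else gawk
    printf '%s\n' "$chosen"
}

unset AWK
pick_awk              # AWK unset: prints gawk
AWK=''
pick_awk              # AWK set but empty: prints gawk
AWK=mawk
pick_awk              # AWK non-empty: prints mawk
unset AWK
```

Because the script only fills in missing values, the override can also be given as a plain environment prefix, for example AWK=mawk ./checksym.sh foo.o libbar.a (the file names are placeholders).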

Since we parse objdump output, we set the default C locale, to ensure the output is not unexpectedly localized.  (Compare e.g. date and LANG=C LC_ALL=C date).

printf is a bash builtin, a bit nicer than echo.  It is used to save all object file names, one per line, to "$Work/objfiles", and all library or archive file names, one per line, to "$Work/libfiles".  These are separated only so that we can differentiate between standalone object files and object files within an archive in our output.

The reason we write the names to a file is so that huge projects, with tens of thousands of files, can be supported.  On some systems the number of command-line parameters is limited, you see.  If you encounter that limit, this initial part of the script can be modified to pass the library and object file names in a different way.  (-@file-name would be a common way we could use to specify a file containing object or archive/library file names.  We could also specify name patterns looked for in an entire subtree, via find.)

Application startup causes significant latencies in batch scripts.  To minimize this, and to ensure we only dump each library or object file once, we sort the file containing the names of the files to be processed and feed it to xargs, which runs objdump with as many file names per invocation as possible until all files are covered.
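
The deduplicate-then-batch step can be tried without any object files.  In this sketch echo stands in for objdump, so it only prints the command line xargs would build; the archive names and the function name are made up, and -r and -d are GNU xargs options, as in the script.

```shell
#!/bin/bash
# 'echo' stands in for objdump here, so the sketch only prints the command
# line that xargs would build.  The archive names are made up.
show_objdump_cmd() {
    printf '%s\n' libb.a liba.a libb.a \
      | LC_ALL=C sort -u \
      | xargs -r -d '\n' echo objdump -t
}

show_objdump_cmd      # prints: objdump -t liba.a libb.a
```

The duplicate libb.a is dropped by sort -u, and both remaining names end up on a single objdump command line.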

The objdump -t output is fed through a SED script, that manipulates the output in the following ways:
  • /^$/d;
    Deletes empty lines
  • /^SYMBOL TABLE:/d;
    Deletes lines beginning with SYMBOL TABLE:
  • s|^In archive \(.*\): *$|\t\1|;
    Replaces lines beginning with In archive and ending with a colon with a TAB and the archive name
  • s|: \+file format .*$|\t|;
    Replaces colon, space(s), followed by "file format", with a single TAB skipping the rest of the line.
  • s| \([^ \t]\+\t[0-9A-Fa-f]\+\) |\t\1\t| ;
    If there is a space followed by a token, tab, a hexadecimal number, and a space, replaces those spaces with TABs
  • s|^\([0-9A-Fa-f]\+\) |\1\t|
    Replaces the space following the first hexadecimal number with a TAB.
The object file sed is similar, except it omits the archive name, and the object file name itself is converted to a single-field record.
Both are emitted to $Work/symbols, which contains TAB-separated fields, one record per line.  It has
  • Records with two fields naming an object file within an archive.
    If the first field is empty, the second field names a new archive file.  If the second field is empty, the first field names the object file.
    I use the convention archive-file:object-file for these, in the above script
  • Records with just one field name a separate object file (not inside any archive).
  • Records with five fields specify a symbol table entry.
    First and fourth fields are hexadecimal numbers, and not interesting.
    Second field contains flag characters:
      g: Global
      u: Unique global
      !: Global and local
      w: Weak
      F: Function
    For others, see man 1 objdump, under -t or --syms option.
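
As a quick check of the five-field symbol records described above, this sketch runs the two symbol-line substitutions on two hand-written lines in the shape of objdump -t output.  The sample lines and function names are made up for illustration; like the script, it assumes GNU sed (for \+ and \t).

```shell
#!/bin/bash
# Two made-up objdump -t symbol lines: a global function and an undefined
# reference.  printf turns \t into the TAB objdump puts after the section.
sample_symbols() {
    printf '00000000 g     F .text\t0000000c foo\n'
    printf '00000000         *UND*\t00000000 bar\n'
}

# Same two symbol-line substitutions as in the script, followed by a small
# awk that prints the field count, the section and the symbol name.
convert_symbols() {
    sed -e 's| \([^ \t]\+\t[0-9A-Fa-f]\+\) |\t\1\t| ; s|^\([0-9A-Fa-f]\+\) |\1\t|' \
      | awk 'BEGIN { FS = "\t" } { printf "%d %s %s\n", NF, $3, $5 }'
}

sample_symbols | convert_symbols
```

Both lines come out with five TAB-separated fields, so this prints "5 .text foo" and "5 *UND* bar".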
Finally, we feed the $Work/symbols to awk, which tracks the file name (for each record), adding symbol references (those with section *UND*) to refs[] array, function definitions to funs[] array, and other object definitions to objs[] array.  Weak symbols are internally differentiated by adding a ! in front of the symbol name.
Each of the three arrays (funs, objs, and refs) has the symbol name as a key, preceded with a ! if the symbol or reference is weak, and the value is a TAB-separated list of the file names where the definition or reference occurs.

The report is output in the END rule, which is triggered after all input records have been processed.

If any non-weak function symbol in funs has a value with a TAB in it, it is defined in more than one file.

If there is a weak function symbol in funs without a corresponding non-weak symbol (say, there is !foo but no foo), then we have a weak function symbol definition without a corresponding strong symbol definition.

Each key in funs must be defined in refs also, or that key (weak or non-weak function) is not referenced at all.  Such functions are either unnecessary, or unnecessarily global.  (They might be used in the same object file they are defined in; to resolve these, we'd need to look at the relocation table for that particular object file.)

If there is a key in refs, but no corresponding (weak or non-weak) key in funs or objs, we have a dangling, unresolvable reference.
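
The checks above can be tried on a handful of hand-written records.  This is a reduced sketch, not the script itself: it ignores weak symbols and archives, keeps functions and objects in a single defs array, and the records and function names are made up.

```shell
#!/bin/bash
# Made-up records in the $Work/symbols format: foo is defined in both
# a.o and b.o, bar is referenced but never defined, main is never referenced.
sample_records() {
    printf 'a.o\n'
    printf '0\tg     F\t.text\t4\tfoo\n'
    printf '0\t       \t*UND*\t0\tbar\n'
    printf 'b.o\n'
    printf '0\tg     F\t.text\t4\tfoo\n'
    printf '0\tg     F\t.text\t4\tmain\n'
}

# Reduced bookkeeping: one defs[] array, no weak-symbol handling.
report() {
    awk 'BEGIN { FS = "\t" }
         NF == 1 { fname = $1 }
         NF == 5 && $3 == "*UND*" { refs[$5] = fname }
         NF == 5 && $3 != "*UND*" {
             if ($5 in defs) defs[$5] = defs[$5] "\t" fname
             else            defs[$5] = fname
         }
         END {
             for (s in defs) if (defs[s] ~ /\t/) print "duplicate " s
             for (s in defs) if (!(s in refs))  print "unreferenced " s
             for (s in refs) if (!(s in defs))  print "unresolved " s
         }' | LC_ALL=C sort
}

sample_records | report
```

For these records the report lists foo as duplicate, foo and main as unreferenced, and bar as unresolved.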
« Last Edit: March 22, 2023, 02:06:14 pm by Nominal Animal »
 
The following users thanked this post: peter-h

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #78 on: June 30, 2023, 02:34:56 pm »
Unfortunately, after some hours on this, I am asking for help again with the unbelievable GCC LD syntax





Thank you in advance :)

If I simply comment-out line 271, I get a syntax error on line 281.

The purpose of those two blocks is to place initialised data for main.c into RAM first, and then place all other initialised data into RAM after that. I then repeat the exercise for BSS (main.c first, rest later).
« Last Edit: June 30, 2023, 02:37:05 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #79 on: June 30, 2023, 02:50:25 pm »
I have never seen either a dummy, empty output section entry (line 268) or anonymous output section (line 281) in any linker script file. Documentation ( https://sourceware.org/binutils/docs/ld/Output-Section-Description.html ) does show section name and braces as mandatory:

"The colon and the curly braces are also required."

I don't think there is anything "unbelievable" in this. What do you think omitting the content would do? It's like, in C, doing

Code: [Select]
    switch(var)

    go_on_with_program_forgetting_the_cases();

and then complaining the compiler is stupid.

If you comment line 271 out, then that part becomes valid because line 268 is now the output section name for braces starting at 272, but there is another error, at line 280 you would need another output section name.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #80 on: June 30, 2023, 03:07:59 pm »
Thanks.

The label all_nonboot_data was intended to facilitate
_si_nonboot_data = LOADADDR(.all_nonboot_data);
which in turn is used in

Code: [Select]
// Initialise DATA

extern char _s_nonboot_data;
extern char _e_nonboot_data;
extern char _si_nonboot_data;
memcpy(&_s_nonboot_data, &_si_nonboot_data, &_e_nonboot_data - &_s_nonboot_data);

I have now solved it (but need to examine the memory to make sure it is actually working) as follows

Code: [Select]

 
/* Initialized data sections for non boot block code. These go into RAM. LMA copy is loaded after code. */
/* This stuff is copied from FLASH to RAM by C code in the main stub */
. = ALIGN(4);

/* main.c stuff is loaded first, for RAM address consistency between factory and customer code */
  .XXX_main_data :
  {
    . = ALIGN(4);
    _s_nonboot_data = .;        /* create a global symbol at data start */
    KEEP(*(.XXX_main))
    *XXX_main.o (.data .data*)      /* .data sections */
    . = ALIGN(4);
  } >RAM  AT >FLASH_APP

/* Remaining DATA stuff */
.XXX_other_data :
  {
    . = ALIGN(4);
    *(.data .data*)      /* .data sections */
      . = ALIGN(4);
    _e_nonboot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_APP

  /* used by the main stub C code to initialize data */
  _si_nonboot_data = LOADADDR(.XXX_main_data);


  /* Uninitialized data section (BSS) for non block boot code */
/* This stuff is zeroed by C code in the main stub */
 
  .XXX_main_bss :
  {
    . = ALIGN(4);
    _s_nonboot_bss = .;        /* create a global symbol at BSS start */
    KEEP(*(.XXX_main))
    *XXX_main.o (.bss .bss* .COMMON .common .common*)      /* .bss sections */
    . = ALIGN(4);
  } >RAM
 
  /* Remaining BSS stuff */
 
  .XXX_other_bss :
  {
      . = ALIGN(4);
    *(.bss .bss* .COMMON .common .common*)
    . = ALIGN(4);
    _e_nonboot_bss = .;          /* define a global symbol at BSS end */
  } >RAM
 

I don't mind being called an idiot so long as I am offered a solution ;)

Those weird sections were in the original ST linkfiles.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9653
  • Country: fi
Re: A question on GCC .ld linker script syntax
« Reply #81 on: June 30, 2023, 03:29:01 pm »
That : thing is not just some arbitrary label; it is part of the output section description syntax, whose mandatory parts are the output section name, the :, and the {}.

Assignment using LOADADDR is a completely different beast. https://sourceware.org/binutils/docs/ld/SECTIONS.html lists the four possible kinds of things that can appear in SECTIONS. Your example uses two of them, symbol assignments and output section descriptions.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #82 on: June 30, 2023, 03:54:52 pm »
This stuff is a whole new world. One spends years to understand how to write C at a basic level and then some months to learn how GCC linkfiles work.

I've been doing this stuff for 40 years and have never seen anything as tacky as GCC LD syntax. It's obvious from projects one finds online that most people just use the one which came with the development board (which is basically what I did, a few years ago) and they never change it.

It's a whole new paradigm, whole new philosophy to learn.

It seems to run OK and the memcpy and memset statements are processing the correct addresses

Code: [Select]
// Initialise DATA

extern char _s_nonboot_data;
extern char _e_nonboot_data;
extern char _si_nonboot_data;
memcpy(&_s_nonboot_data, &_si_nonboot_data, &_e_nonboot_data - &_s_nonboot_data);

// Zero BSS and COMMON

extern char _s_nonboot_bss;
extern char _e_nonboot_bss;
memset(&_s_nonboot_bss, 0, &_e_nonboot_bss - &_s_nonboot_bss);

There is still a strange problem:

I have these two sections

Code: [Select]

  .XXX_main_bss :
  {
    . = ALIGN(4);
    _s_nonboot_bss = .;        /* create a global symbol at BSS start */
    KEEP(*(.XXX_main))
    *XXX_main.o (.bss .bss* .COMMON .common .common*)      /* .bss sections */
    . = ALIGN(4);
  } >RAM
 
  /* Remaining BSS stuff */
 
  .XXX_other_bss :
  {
      . = ALIGN(4);
    *(.bss .bss* .COMMON .common .common*)
    . = ALIGN(4);
    _e_nonboot_bss = .;          /* define a global symbol at BSS end */
  } >RAM
 
 

The intention is to have BSS variables from XXX_main.o first and then all other BSS variables after that. But it isn't working, looking at their addresses in the .map file.

The allocation of variables in the .map file (in address order) is fairly random and not in the order of declaration in the .c file. I guess this is not guaranteed.
« Last Edit: June 30, 2023, 05:02:58 pm by peter-h »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #83 on: July 01, 2023, 04:31:54 pm »
I have another LD syntax question:



I get a LD syntax error at line 160.

Yet the same format is used elsewhere in the linkfile. Removing RAM AT removes the error but then I have code which is linked to run in FLASH and not in the RAM.

The intention is for this to work

Code: [Select]

// === At this point, interrupts and DMA must still be disabled ====
// Execute loader. Reboots afterwards.
// Parameters for loader are in the SSA, although they could have simply
// been passed as function parameters in loader_entry().

extern char _loader_start;
extern char _loader_end;
extern char _loader_ram_start;

// Copy loader code and its init data to RAM.
B_memcpy(&_loader_ram_start, &_loader_start, &_loader_end - &_loader_start);

// See comments in loader.c for why the long call.

extern void loader_entry() __attribute__((long_call));
loader_entry();

// never get here (loader always reboots)
for (;;);

It worked previously, but the RAM location for the loader was set by a MEMORY statement, and the linker complained that it overlapped some other stuff (which was true but physically irrelevant). I spent days looking for other ways to specify the execution address for code (including posts here), but I never found anything which was supported by the arm32 GCC LD.

If there was a way to specify an execution address, either in the linkfile or in the .c file, that would do the job because when the loader is running I have the whole 128k/192k RAM to play with. The only place it cannot go is the CCM which cannot run code; I use it for the loader stack and various buffers.

Any info much appreciated, as always.

I thought that perhaps it doesn't like "text" being placed in RAM because at that point it can't know the execution address for the code, but removing the "text" parts does not change it.
« Last Edit: July 01, 2023, 04:34:03 pm by peter-h »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #84 on: July 01, 2023, 04:54:53 pm »
What is RAM?  Shouldn't that be
    } >RAM AT>FLASH_BOOT
per Output Section Attributes?
 
The following users thanked this post: peter-h

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #85 on: July 01, 2023, 06:30:53 pm »
Oh bugger I do apologise for something so simple!

It actually opened up a can of worms... the linkfile section is not collecting anything from b_loader.o

Code: [Select]

  /* b_loader.o - code goes last in the boot block (no special need for that) */
  /* code and initialised data is all lumped together */
  /* **** The loader must not use any BSS **** */
  /* The loader code is located to execute after the end of BSS above and gets */
  /* copied there in b_main */
     
  .b_loader_all :
  {
    . = ALIGN(4);
    _loader_ram_start = .;
    KEEP(*(.b_loader))
    *b_loader.o (.text .text* .rodata .rodata* .data .data*)
      . = ALIGN(4);
      _loader_ram_end = .;
  } >RAM AT>FLASH_BOOT
 
  _loader_flash_start = LOADADDR(.b_loader_all);
 
  /* this is just for .map file lookup */
 
  _loader_size = _loader_ram_end - _loader_ram_start;

Code: [Select]

.b_loader_all   0x000000002000fba0        0x0 load address 0x0000000008001ed8
                0x000000002000fba0                . = ALIGN (0x4)
                0x000000002000fba0                _loader_ram_start = .
 *(.b_loader)
 *b_loader.o(.text .text* .rodata .rodata* .data .data*)
                0x000000002000fba0                . = ALIGN (0x4)
                0x000000002000fba0                _loader_ram_end = .
                0x0000000008001ed8                _loader_flash_start = LOADADDR (.b_loader_all)
                0x0000000000000000                _loader_size = (_loader_ram_end - _loader_ram_start)

Notably _loader_size=0.

It is for this

Code: [Select]
extern char _loader_ram_start;
extern char _loader_ram_end;
extern char _loader_flash_start;

// Copy loader code and its init data to RAM.
B_memcpy(&_loader_ram_start, &_loader_flash_start, &_loader_ram_end - &_loader_ram_start);

I've had these constructs working before but this time there is something else. I did check that no earlier linkfile statement is referencing b_loader.o and stealing the code.

I think this block may be stealing the loader stuff

Code: [Select]
/* This collects all other stuff, which gets loaded into FLASH */
    .code_constants_etc :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
*(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _e_code_constants_etc = .;        /* define a global symbol at end of code */
} >FLASH_APP

but I can't put the loader block above this one because that would break something else.

Is there some way to ignore certain input e.g. ignore b_loader.o? That way a later block could pick it up. After a lot of time searching the LD manual and googling I found the EXCLUDE_FILE directive

Code: [Select]
 
/* This collects all other stuff, which gets loaded into FLASH */
    .code_constants_etc :
  {
 
      *(EXCLUDE_FILE(*b_loader.o) .text .text* .rodata .rodata* .data .data* )
     
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
*(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))
   
    . = ALIGN(4);
    _e_code_constants_etc = .;        /* define a global symbol at end of code */
} >FLASH_APP

but, hey, it doesn't work :) I get either no error or another useless syntax error. The internet has loads of people trying the same thing and failing. It could be that the ARM32 version of GCC LD doesn't support EXCLUDE_FILE.

EDIT: after spending many hours on this, I am sure that EXCLUDE_FILE doesn't work. Lots of examples on the web but not for ARM32 GCC LD, and some of them have typos so won't even compile. I think I have solved it now but only by a suitable section ordering.

There is also a /DISCARD/ directive but that discards the symbols permanently, there and then.
« Last Edit: July 02, 2023, 08:01:16 am by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #86 on: July 02, 2023, 02:32:35 pm »
The answer is straight in the manual (again): https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html
EXCLUDE_FILE doesn’t affect subsequent definitions, so your *(.text) etc. below it are catching everything. Just comment them out; the EXCLUDE_FILE line itself already collects all the sections listed in it from every file that does not match the exclusion pattern.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #87 on: July 02, 2023, 03:14:30 pm »
I tried the exclude after the other statements, too.

I've given up on this now. A re-ordered linkfile seems to have done the job. Even reading that manual section is double dutch to me. One has to have a really deep understanding of this stuff.
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: A question on GCC .ld linker script syntax
« Reply #88 on: July 02, 2023, 04:18:08 pm »
The ordering of the exclude vs. other *(xx) entries doesn’t matter. EXCLUDE_FILE itself collects the sections in its section list from all files except those in its exclusion list. It does not tell other collection statements to exclude anything. If they are present too, they work as usual (collecting everything, in your case); you need to remove them completely.
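
In linker-script terms, the catch-all section then carries the exclusion and has no plain *(.text) lines left.  Below is a sketch of that shape, wrapped in a shell here-document only so it can be printed to a file and compared with the real script.  The section and region names are the ones from the posts above, the function name is made up, and the fragment has not been linked against the actual project.  Per the ld manual, an EXCLUDE_FILE written inside the parentheses applies only to the section pattern immediately after it, which is why it is repeated for each pattern.

```shell
#!/bin/bash
# print_exclude_sketch is a hypothetical helper; it only prints text.
print_exclude_sketch() {
    cat <<'EOF'
  /* Catch-all: everything except b_loader.o.  The exclusion is repeated
     because it only applies to the pattern right after it. */
  .code_constants_etc :
  {
    . = ALIGN(4);
    *(EXCLUDE_FILE (*b_loader.o) .text
      EXCLUDE_FILE (*b_loader.o) .text*
      EXCLUDE_FILE (*b_loader.o) .rodata
      EXCLUDE_FILE (*b_loader.o) .rodata*)
    . = ALIGN(4);
  } >FLASH_APP

  /* Later block: the b_loader.o sections are still unclaimed here. */
  .b_loader_all :
  {
    . = ALIGN(4);
    *b_loader.o (.text .text* .rodata .rodata* .data .data*)
    . = ALIGN(4);
  } >RAM AT>FLASH_BOOT
EOF
}

print_exclude_sketch
```

Any earlier output section with a plain *(.data .data*) pattern would still claim the loader's initialised data first, so it would need the same treatment.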
 
The following users thanked this post: peter-h

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #89 on: July 02, 2023, 05:05:00 pm »
OP did start a new thread on this latest sub-question, to which I constructed a rather long description of how to analytically go about building your own linker script, using a spreadsheet or plain text file first to model your address space.  I also posted an example and explanation of the EXCLUDE_FILE syntax.

I fully understand the frustration with trying to construct a linker script, before one truly groks the paradigm, the approach or model or idea behind how the linker is controlled.  Thing is, it isn't too complicated; it is just strange, in the sense that the hurdle is understanding how it is supposed to be used and how the language works.  Yes, it will take several hours to grok it, and yes, it is annoying as hell to have to do that, but unless you do, you will suffer when trying to do anything but the simplest changes to existing linker scripts.  (Me, I like to do experiments when learning.  In this case, one can create a really trivial "firmware" with just the minimum symbols – at least one for each section – but it does not need to be functional; then, starting with a minimal linker script, and making changes and storing the resulting map with the copy of the linker script, will let you [spend a lot of time and] learn the ins-and-outs of the syntax.  Which is nothing like C at all.)

Having the entire memory, address ranges and output sections and the linker defined symbols in a spreadsheet or diagram, does mean that anyone with sufficient familiarity with linker scripts can help with its implementation.  Working on just the linker script itself, we have the same age-old problem we have with programming without comments: we can see what the script is doing, but we don't know the actual intent; the spreadsheet or diagram provides that, if sufficiently well organized.
 
The following users thanked this post: peter-h, SiliconWizard

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #90 on: July 02, 2023, 07:45:31 pm »
FWIW my linker script is heavily commented. I spend half the time updating the comments.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #91 on: July 02, 2023, 08:55:54 pm »
Quote from: peter-h
FWIW my linker script is heavily commented. I spend half the time updating the comments.
From experience, I can say the comments are nothing compared to an actual memory map table or diagram.

I know you don't want to spend the "extra" time creating such a spreadsheet or table, and would prefer just to get it working and move on to more interesting things, but I promise, you will save time overall if you do the spreadsheet or table, and include the logic and descriptions from your linker script comments as musings in the same file as the spreadsheet or table.

I do not use Inkscape, Dia, LibreOffice Calc, Graphviz, etc. (and obviously pen and paper!) just because I like pretty pictures.  I use them because they make it possible for me to do things I would not be able to do without.  Even sketching each of the memory regions (flash, RAM, closely-coupled RAM) as a bar, and then marking regions with }-marks or hatches with lines to explanations will help a lot.  Just remember to include everything, and not leave subtle details to work out in the script, because it is exactly those subtle details that WILL bite you if you don't prepare for them.

Tools.  Use 'em.  Boil 'em, mash 'em, put 'em in a stew.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #92 on: July 03, 2023, 08:37:45 pm »
A little Q on this linkfile line:

   RAM (xrw)           : ORIGIN = 0x20000000, LENGTH = 128K /*FOR 32F417 */

Am I right in that the ORIGIN = sets up the start of where all the >RAM directives deposit the variable allocations? Obviously that is right, but is there anything else? I cannot find any reference to "RAM" anywhere except in the >RAM directives.

Also AFAICT the LENGTH = parameter is used purely to check for a) section overlap and b) generating the % bargraphs in the Build Analyser display. My code runs exactly the same if I put in LENGTH = 1024K which is nowhere near physically present.

But I wonder whether Cube might be digging these out and using them for something. I don't think so; I think it gets everything from the ELF file and from Cube Debug settings.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online ejeffrey

  • Super Contributor
  • ***
  • Posts: 4090
  • Country: us
Re: A question on GCC .ld linker script syntax
« Reply #93 on: July 03, 2023, 09:25:06 pm »
It is mostly used to detect if you try to link something that doesn't fit.
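
As a rough illustration of that check (a sketch of the idea, not ld's actual code): the linker keeps a running address per region and reports an error once an allocation would pass ORIGIN + LENGTH. The struct and function names below are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MEMORY region; not an ld data structure. */
struct region {
    uint32_t origin;  /* ORIGIN = ... */
    uint32_t length;  /* LENGTH = ... */
    uint32_t used;    /* bytes allocated so far, counted from origin */
};

/* Try to place `size` bytes, aligned to `align` (a power of two), in the
   region.  On success store the assigned address in *addr and return true.
   Return false, leaving the region untouched, if it would not fit. */
bool region_alloc(struct region *r, uint32_t size, uint32_t align, uint32_t *addr)
{
    uint32_t start = (r->origin + r->used + (align - 1u)) & ~(align - 1u);
    uint32_t offset = start - r->origin;

    if (offset > r->length || size > r->length - offset)
        return false;   /* this is the "region overflowed" case */

    *addr = start;
    r->used = offset + size;
    return true;
}
```

With LENGTH raised from 128K to 1024K every call that succeeded before still succeeds and returns the same address, which matches the observation that the code runs exactly the same: only the failure case moves.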
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #94 on: July 03, 2023, 10:42:03 pm »
A little Q on this linkfile line:

   RAM (xrw)           : ORIGIN = 0x20000000, LENGTH = 128K /*FOR 32F417 */

Am I right in that the ORIGIN = sets up the start of where all the >RAM directives deposit the variable allocations?
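Yes, for the runtime addresses (VMA) the variables are assigned.  The initial contents are also stored there only when there is no AT> (or there is a superfluous AT>RAM); with AT>FLASH the contents are stored in flash instead.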

But I wonder whether Cube might be digging these out and using them for something.
Your toolchain includes objdump (it is part of binutils, just like ld is; they're part of the same package).  Try running
    objdump -fwh path-to-ELF-file
It will tell you the architecture, start address (ENTRY), and list all sections (if this is the linked ELF result, these are the output sections).
The VMA column contains the logical addresses (expected runtime addresses), whereas the LMA column contains the storage addresses (controlled by AT).

This, or the equivalent, is what Cube is looking at, and what the tool that generates the firmware .hex file from the ELF object file looks at.

is there anything else? I cannot find any reference to "RAM" anywhere except in the >RAM directives.
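The access mode attributes (Read, Write, eXecute) tell the linker which region to use for input sections that are not explicitly mapped in the script.  The section flags in the ELF file come from the input sections themselves, and for microcontrollers without virtual memory etc. they do not matter.  The attributes are useful for us humans, I guess, though.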

Technically, you don't need the MEMORY command, if you just specify the address and storage address for each output section; in that case, I'd use constant symbols to specify the start addresses, but I think it would just be messier and more text in the script.



Think of it this way:
    MEMORY { address space rules and memory region names }
    SECTIONS { output section definitions }
with each output section definition being
    outputsection { input section rules } >region
or
    outputsection { input section rules } >useregion AT>storageregion
(or one of the valid expressions involving symbols and the current output address).

If only region is specified, then useregion=region and storageregion=region.

useregion is where the linker assumes the contents (data or code) are during use.
storageregion is where the linker stuffs the data.

Most often you'll only see } >RAM AT>FLASH (in addition to normal } >region), because it tells the linker that "these output sections are stored somewhere in flash, but there is code that copies them to RAM before they are used".
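
For reference, the copy that such startup code performs can be sketched in C.  This is an illustration only: the function name is made up, and on the target the three pointers would come from the linker symbols _sidata, _sdata and _edata shown earlier in the thread.

```c
#include <stdint.h>

/* Copy initialised data from its storage address (LMA, in flash) to its
   runtime address (VMA, in RAM), one 32-bit word at a time.
   Hypothetical helper: in real startup code the arguments would be
   &_sidata, &_sdata and &_edata from the linker script. */
void copy_data_init(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = *load++;
}
```

On the target this would be called as copy_data_init(&_sidata, &_sdata, &_edata) after declaring the three symbols as extern uint32_t.  The word-wise copy relies on the ALIGN(4) statements around _sdata and _edata in the output section.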
 
The following users thanked this post: peter-h

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #95 on: July 04, 2023, 08:35:25 am »
Great; thanks.

It sounds like this issue
https://www.eevblog.com/forum/microcontrollers/32f417-32f437-auto-detect-of-extra-64k-ram/
is not related to having LENGTH = 192K with a 32F417.

Experimentally I can confirm that but I wondered if there was something else.

It does produce a confusing Build Analyser display and I just have to document that...
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #96 on: July 04, 2023, 11:41:23 am »
Yup.

If you want to order your functions in a specific way, you can use
    __attribute__ ((section (".text_NNNNN"))) function-definition
if, in the linker script, you replace input section
    *(.text*)
with the following three input sections:
    *(SORT(.text_*)) *(.text) *(SORT(.text*))

The same replacement applies when the input section has a file exclusion; for example
    EXCLUDE_FILE(*loader.o *boot.o) *(.text*)
becomes
    EXCLUDE_FILE(*loader.o *boot.o) *(SORT(.text_*))
    EXCLUDE_FILE(*loader.o *boot.o) *(.text)
    EXCLUDE_FILE(*loader.o *boot.o) *(SORT(.text*))

SORT() sorts files or sections by name.  If you use a nonnegative integer for NNNNN, make sure you use a fixed number of digits, because .text_09 < .text_10 < .text_9.
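
The digit-count pitfall is easy to check on a PC.  The sketch below assumes SORT compares names byte by byte, the way strcmp does; the helper names are made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: plain byte-wise string comparison. */
static int by_name(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort an array of section names in place, in ascending order. */
void sort_section_names(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], by_name);
}
```

Sorting the three names .text_9, .text_10 and .text_09 this way puts .text_9 last, because the comparison stops at the first differing character and '1' sorts before '9'.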

A nonnegative integer NNNNN also works when compiling with the gcc -ffunction-sections option, because gcc then typically uses section names of the form .text.functionname, which the .text_* pattern does not match.  (When compiling with -ffunction-sections, the linker can omit unused sections, and therefore unused functions, when --gc-sections is used.)

In the binary, the .text_NNNNN sections will be first, in sorted order of NNNNN.  If a section contains more than one symbol, those symbols are in random order.  Next come functions without the section attribute, compiled without -ffunction-sections.  Finally come functions without the section attribute, compiled with -ffunction-sections, sections in sorted order (by function name).
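
On the C side, the three cases could look like the sketch below.  It assumes GCC or Clang targeting ELF; the function names and the section numbers are made up for the example.

```c
/* Explicitly numbered sections: matched by *(SORT(.text_*)) and placed
   first, in name order. */
__attribute__((section(".text_00010")))
int init_clocks(void)   /* hypothetical function */
{
    return 1;
}

__attribute__((section(".text_00020")))
int init_uart(void)     /* hypothetical function */
{
    return 2;
}

/* No attribute: ends up in .text, or in .text.main_loop when compiled
   with -ffunction-sections. */
int main_loop(void)
{
    return init_clocks() + init_uart();
}
```

The zero-padded five-digit numbers leave room to slot other functions in between later without renumbering.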

(Edited to add a missing [/tt] tag that wonkified the text in the post.)
« Last Edit: July 04, 2023, 09:49:18 pm by Nominal Animal »
 
The following users thanked this post: peter-h

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4591
  • Country: gb
  • Doing electronics since the 1960s...
Re: A question on GCC .ld linker script syntax
« Reply #97 on: July 04, 2023, 09:19:11 pm »
That's a really interesting capability.

What order would the linker normally use when following the linkfile, say within just one block of it?

If it does it in the order in which .o files are found in the directory (using the traditional find-first and find-next directory listing method) then it could be totally random.

But looking at the .map file it looks like it is sorting the stuff by full pathname, which is interesting because it is sorting across the whole directory tree.

Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7527
  • Country: fi
    • My home page and email address
Re: A question on GCC .ld linker script syntax
« Reply #98 on: July 04, 2023, 10:06:39 pm »
The files should occur in the order they are specified in the link command line by default.  Perhaps Cube sorts them?

Remember, ld does not actually go looking for any file names; it only considers the file names given on its command line, and uses that order by default.

Section order, however, can be affected by --sort-section=name, which causes all input section names to be wrapped in SORT, i.e. any section name glob pattern to be expanded in sorted order.  It shouldn't affect file name ordering at all, though.
 

