Topic: Bare-metal ARM, gcc, position-independent code -- possible?

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Bare-metal ARM, gcc, position-independent code -- possible?
« on: March 01, 2022, 06:11:14 pm »
Hi,

I would very much appreciate some advice on building position-independent applications with gcc.

Background

A device has a bootloader and two slots for applications, say A and B. The bootloader starts, chooses which app to run, and jumps into it.

An application must have position-independent code because it is not known in advance which slot it will occupy -- the firmware update process can place it in either A or B. Data should keep absolute addresses. So the whole thing should be read-only position-independent, ROPI.

Slot size is 400 kB. (For the record, current app size is about 230 kB)

Compiler is arm-none-eabi-gcc v9.3.1, the app runs in FreeRTOS. The project contains about 100 source files, some of them are vendor's HAL and BSP, but the majority are ours.

The build

  • CFLAGS += -fPIC -mno-pic-data-is-text-relative -msingle-pic-base -mpic-register=r9
  • LFLAGS += -fPIC
  • rebuild Newlib from source with CFLAGS
  • fix FreeRTOS to not trash R9 register
  • build the app for slot A
  • in start-up code:
    • copy the GOT from flash to RAM and relocate code addresses (don't touch data addresses)
    • set up R9 to point to the GOT
    • copy the vector table from flash to RAM, relocate addresses, set up VTOR
    • copy the .data section to RAM, clear the .bss section
    • call main()
Place the app into slot A and enjoy.
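
The GOT step in the list above can be sketched roughly as follows. This is a minimal host-testable sketch, not the poster's actual start-up code: `image_layout_t`, `got_relocate` and the slot addresses used to exercise it are hypothetical names and values. It assumes that any GOT entry falling inside the link-time flash range is a code address, and that a zero entry is a null pointer that must stay zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where the image was linked and where it runs. */
typedef struct {
    uint32_t link_base;  /* flash address the image was linked for */
    uint32_t link_size;  /* size of the flash image region in bytes */
    uint32_t load_base;  /* flash address the image actually runs from */
} image_layout_t;

/*
 * Copy the GOT from flash to RAM, relocating only entries that point into
 * the link-time flash range. Everything else (RAM/data addresses) is copied
 * unchanged, matching the "don't touch data addresses" rule.
 */
void got_relocate(uint32_t *got_ram, const uint32_t *got_flash,
                  size_t entries, const image_layout_t *img)
{
    assert(got_ram != NULL && got_flash != NULL && img != NULL);

    /* Unsigned arithmetic wraps, so the delta works in either direction. */
    const uint32_t delta = img->load_base - img->link_base;

    for (size_t i = 0; i < entries; i++) {
        uint32_t entry = got_flash[i];

        /* Assumption: zero means "null pointer", not an image address. */
        if (entry != 0u && (entry - img->link_base) < img->link_size) {
            entry += delta;  /* the Thumb bit (bit 0) is preserved */
        }
        got_ram[i] = entry;
    }
}
```

On the target, `got_flash`, `got_ram` and the entry count would come from linker-script symbols, and R9 would then be pointed at `got_ram`.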

The problem

The app fails when placed into slot B. The source of the problem is data structures initialised with function addresses like this:

Quote

extern int bspR1BoardGetMcuUid(void);
extern int bspR1BoardGetBoardType(void);
extern int bspR1BoardGetSerialNumber(void);

halBoard_t halBoard = {
    .bspGetMcuUid       = bspR1BoardGetMcuUid,
    .bspGetBoardType    = bspR1BoardGetBoardType,
    .bspGetSerialNumber = bspR1BoardGetSerialNumber,
};

driverBoard_t driverBoard = {
    .hal = &halBoard,
};

driverBoardSetup(&driverBoard);

halBoard is initialised as follows:
Quote

halBoard_t halBoard = {
 0000C6A6   LDR.W   R3, =0x000000D4
 0000C6AA   LDR.W   R3, [R9, R3]
 0000C6AE   LDM.W   R3, {R0-R2}       ; <--- R3=10001E00, points to .data section
 0000C6B2   ADD.W   R3, SP, #0x1F20
 0000C6B6   ADDS    R3, #8
 0000C6B8   STM.W   R3, {R0-R2}
driverBoard_t driverBoard = {
 ........
driverBoardSetup(&driverBoard);
 0000C6D4   MOV     R0, R2
 0000C6D6   BL      driverBoardSetup 
 ........

_sdata
 10001E00   DC32    0x000126B1      ; bspR1BoardGetMcuUid
 10001E04   DC32    0x0003AE99      ; bspR1BoardGetBoardType
 10001E08   DC32    0x0003AEA5      ; bspR1BoardGetSerialNumber
 10001E0C   DC32    0x00000000
 10001E10   DC32    0x10001E28      ; networkBands.15358
 10001E14   DC32    0x00000003
 ........

So, absolute addresses are placed into .data section (first three words) and then used to initialise the structure.


The pages suggested by Google include:

https://community.arm.com/support-forums/f/compilers-and-libraries-forum/44805/cannot-build-position-independent-code-that-works -- Yes, but the resulting shared object is almost three times larger, growing from the original 230 kB to 650 kB. It will not fit the slot.

https://mcuoneclipse.com/2021/06/05/position-independent-code-with-gcc-for-arm-cortex-m -- Yes, but manually creating a PLT or other Linuxy stuff is out of the question.

And lots of others, without a clear answer.

The question

Has anyone successfully built a ROPI application:
  • with gcc and relevant command-line options (which?),
  • without manual veneers or other crutches,
  • with a reasonable size?

How?

gcc is not a strict requirement. If it is the wrong tool, would Clang do the job?

Thank you for hints, pointers, etc.
« Last Edit: March 01, 2022, 06:20:13 pm by eutectique »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14507
  • Country: fr
 
The following users thanked this post: harerod

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11277
  • Country: us
    • Personal site
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #2 on: March 01, 2022, 07:17:41 pm »
In the past I tried to create reasonable PIC code using GCC, and I could not make it work. You very quickly run into needing GOT and all associated overhead.

I was looking at it for OTA firmware updates where the image would be located in one part of the flash or the other. A more reasonable solution I found is to build two images linked at different addresses. The OTA image includes both, and the device selects which one to program based on the side that is currently running (it picks the opposite of the running one). The other half is discarded, or is not downloaded in the first place.
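
A minimal sketch of the selection logic described here, assuming a package that carries one image linked for each slot. The names `slot_t`, `ota_package_t`, `update_target` and `ota_select_image` are hypothetical and not taken from any real OTA library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SLOT_A = 0, SLOT_B = 1 } slot_t;

/* Hypothetical OTA package: the same firmware linked twice. */
typedef struct {
    const uint8_t *image_for_a;  /* build linked at the slot A address */
    size_t         len_a;
    const uint8_t *image_for_b;  /* build linked at the slot B address */
    size_t         len_b;
} ota_package_t;

/* The update always goes to the slot that is NOT currently running. */
slot_t update_target(slot_t running)
{
    return (running == SLOT_A) ? SLOT_B : SLOT_A;
}

/* Pick the image whose link address matches the target slot. */
const uint8_t *ota_select_image(const ota_package_t *pkg, slot_t running,
                                size_t *len)
{
    assert(pkg != NULL && len != NULL);

    if (update_target(running) == SLOT_A) {
        *len = pkg->len_a;
        return pkg->image_for_a;
    }
    *len = pkg->len_b;
    return pkg->image_for_b;
}
```

If the transport allows it, the same decision can be made before the download starts, so that only the matching half is fetched.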
« Last Edit: March 01, 2022, 07:20:27 pm by ataradov »
Alex
 
The following users thanked this post: eutectique, harerod

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #3 on: March 01, 2022, 09:32:58 pm »
Quote from: SiliconWizard
You may want to refer to this thread: https://www.eevblog.com/forum/microcontrollers/how-to-create-elf-file-which-contains-the-normal-prog-plus-a-relocatable-block/

It's a bit away from what I am after, but thanks anyway!

( :palm: 4 pages of discussion of the problem which can be solved with one line of the linker script? Ok, two lines. And the second line to the source code.)
« Last Edit: March 01, 2022, 09:49:07 pm by eutectique »
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #4 on: March 01, 2022, 09:47:05 pm »
Quote from: ataradov
I was looking at it for OTA firmware updates where the image would be located in one part of the flash or the other.

Yes, that's exactly why I want PIC.

Looks like two images would be the viable solution. Thank you!
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 825
  • Country: es
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #5 on: March 01, 2022, 10:06:51 pm »
You can do it a different way:
- save OTA to slot B (always)
- reboot to bootloader
- validate slot B image
- copy it to slot A (if valid)
- invalidate slot B
- jump to slot A
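
The steps above can be sketched as below, with the two flash slots simulated by plain structs so the logic can be tried on a host. This is only an illustration: `slot_image_t`, `IMAGE_MAGIC`, the additive checksum and the tiny payload size are assumptions, and a real bootloader would use flash erase/program routines and a stronger integrity check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_PAYLOAD_MAX 64u          /* tiny stand-in for a real slot */
#define IMAGE_MAGIC      0x494D4721u  /* hypothetical "image present" marker */

/* Hypothetical slot layout: small header followed by the firmware bytes. */
typedef struct {
    uint32_t magic;
    uint32_t length;
    uint32_t checksum;
    uint8_t  payload[SLOT_PAYLOAD_MAX];
} slot_image_t;

/* Simple additive checksum, used here only to keep the sketch short. */
uint32_t image_checksum(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

bool image_is_valid(const slot_image_t *s)
{
    return s->magic == IMAGE_MAGIC &&
           s->length <= SLOT_PAYLOAD_MAX &&
           image_checksum(s->payload, s->length) == s->checksum;
}

/*
 * Bootloader step: if slot B holds a valid staged image, copy it to slot A
 * and invalidate slot B. Returns true when slot A was replaced. The caller
 * jumps to slot A afterwards in either case.
 */
bool bootloader_apply_update(slot_image_t *slot_a, slot_image_t *slot_b)
{
    assert(slot_a != NULL && slot_b != NULL);

    if (!image_is_valid(slot_b)) {
        return false;               /* nothing staged, keep current image */
    }
    memcpy(slot_a, slot_b, sizeof *slot_a);  /* real code: erase + program */
    slot_b->magic = 0u;             /* invalidate so the copy is not repeated */
    return true;
}
```

Because the application always runs from slot A, it can be linked at a fixed address and no position-independent code is needed.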
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11277
  • Country: us
    • Personal site
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #6 on: March 01, 2022, 10:15:15 pm »
Having two images allows rollback, and some applications may need this. Switching is also faster when nothing has to be copied: you just boot the right image.

But on the other hand managing two images complicates the release a bit.
Alex
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #7 on: March 01, 2022, 10:26:48 pm »
Quote from: abyrvalg
You can do it a different way:
- save OTA to slot B (always)
- reboot to bootloader
- validate slot B image
- copy it to slot A (if valid)
- invalidate slot B
- jump to slot A

Yep, that's how we do it now.
 

Offline bson

  • Supporter
  • ****
  • Posts: 2271
  • Country: us
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #8 on: March 01, 2022, 10:37:11 pm »
Looks like a compiler or possibly linker bug.  Maybe try gcc 10 for good measure?

Maybe it's mistakenly assuming a function pointer (which refers to a pure section) stored in an impure section is impure.  Only pure data and code is relative to the PIC base register; the SRAM is in the same location regardless of which bank you use.  If so, perhaps you can declare the vector table  'const' to make it pure?  This will of course add another address calculation, but it might work at least.

Also, what code is generated for the calls?
« Last Edit: March 01, 2022, 10:40:32 pm by bson »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 825
  • Country: es
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #9 on: March 03, 2022, 12:21:26 pm »
Another option is to wrap all code ptrs with some adjustment func/macro that will add a precalculated displacement.
But I'm sure I've seen many ARM binaries full of constructs like this:
Code:
LDR R0, =PcRelativeOffset
ADD R0, PC
used instead of hardcoded addresses, the question is which compiler produces them.
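
A rough sketch of the wrapper idea from the first sentence of this post, assuming the displacement is computed once at start-up. `g_code_delta`, `CODE_PTR`, `board_fn_t` and the demo function are hypothetical names. Converting a function pointer to `uintptr_t` and back is implementation-defined in C, although it behaves as expected with gcc on ARM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical displacement: (address the image runs from) minus
 * (address it was linked for). Zero when running in the linked slot.
 */
uintptr_t g_code_delta = 0;

/* Turn a link-time code address into the address valid at run time. */
#define CODE_PTR(type, fn) ((type)((uintptr_t)(fn) + g_code_delta))

typedef int (*board_fn_t)(void);

/* Stand-in for a BSP function stored in a statically initialised table. */
int demo_get_board_type(void)
{
    return 7;
}

/* Call through a pointer that still holds its link-time value. */
int call_relocated(board_fn_t link_time_ptr)
{
    board_fn_t fn = CODE_PTR(board_fn_t, link_time_ptr);
    assert(fn != NULL);
    return fn();
}
```

The adjustment should only be applied to pointers that still hold link-time values, such as those copied from .data initialisers. A pointer obtained through an already relocated GOT would be shifted twice.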
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3708
  • Country: gb
  • Doing electronics since the 1960s...
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #10 on: March 03, 2022, 09:26:09 pm »
Quote
4 pages of discussion of the problem which can be solved with one line of the linker script?

You must be extremely clever, because (that was "my" thread) I never found an elegant solution, and AFAICT nobody had ever used the GCC relocatable code option.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline bson

  • Supporter
  • ****
  • Posts: 2271
  • Country: us
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #11 on: March 04, 2022, 04:01:42 pm »
Thinking about this some more, the vector table as declared can't be correctly made PIC.  The pointers you initialize it with are only that - initializers.  When emitting code for the call, the compiler has no idea if the pointer actually found is a PIC offset or absolute.  Either is a perfectly reasonable use case; the latter for copying all sorts of performance-critical code to an SRAM bank and storing function pointers somewhere in a table to access it, in which case adding the base register to the addresses would be completely wrong.  So the first step absolutely should be to make the vector table itself const, and then make sure it's in flash (don't put anything immutable in SRAM unless it's performance critical).

I think the bug here is the compiler or linker doesn't issue a warning, that an impure function pointer initialized with a pointer to a pure section is inherently incompatible with PIC.
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #12 on: March 04, 2022, 07:09:53 pm »
Solution:

Initialise function pointers at run time. Instead of
Code:
halBoard_t halBoard = {
    .bspGetMcuUid       = bspR1BoardGetMcuUid,
    .bspGetBoardType    = bspR1BoardGetBoardType,
    .bspGetSerialNumber = bspR1BoardGetSerialNumber,
};

do the following:

Code:
halBoard_t halBoard = {0};
halBoard.bspGetMcuUid       = bspR1BoardGetMcuUid;
halBoard.bspGetBoardType    = bspR1BoardGetBoardType;
halBoard.bspGetSerialNumber = bspR1BoardGetSerialNumber;

In the first case the compiler generates code to load addresses from .data section. In the second case it generates code to load addresses from GOT, exactly what I want.

Tested with gcc v{9,10}, and Clang v13. The results are the same.
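
For objects at file scope the assignments cannot sit next to the definition, so one way to apply the same idea is a small init function called early in main(). The following is a sketch only: `bsp_fn_t`, `halBoardInit` and the stub function bodies are hypothetical, and the struct is a stand-in with just the three members shown in the thread.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*bsp_fn_t)(void);

/* Stand-in for the real halBoard_t, reduced to the members shown above. */
typedef struct {
    bsp_fn_t bspGetMcuUid;
    bsp_fn_t bspGetBoardType;
    bsp_fn_t bspGetSerialNumber;
} halBoard_t;

/* Stub bodies so the sketch links on a host; the real ones live in the BSP. */
int bspR1BoardGetMcuUid(void)       { return 1; }
int bspR1BoardGetBoardType(void)    { return 2; }
int bspR1BoardGetSerialNumber(void) { return 3; }

/*
 * Fill in the function pointers with ordinary assignments at run time, so
 * the addresses are produced by code rather than copied from a constant
 * initialiser image.
 */
void halBoardInit(halBoard_t *board)
{
    assert(board != NULL);
    board->bspGetMcuUid       = bspR1BoardGetMcuUid;
    board->bspGetBoardType    = bspR1BoardGetBoardType;
    board->bspGetSerialNumber = bspR1BoardGetSerialNumber;
}
```

Whether the compiler really loads these addresses through the GOT can depend on the compiler version and optimisation level, so it is worth checking the disassembly after any toolchain change.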
« Last Edit: March 04, 2022, 07:53:31 pm by eutectique »
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #13 on: March 04, 2022, 07:15:06 pm »
Quote from: bson
Thinking about this some more, the vector table as declared can't be correctly made PIC.

True. It contains absolute addresses. That's why I copy it to RAM and adjust.
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #14 on: March 04, 2022, 07:43:52 pm »
Quote from: peter-h
You must be extremely clever, because (that was "my" thread) I never found an elegant solution, and AFAICT nobody had ever used the GCC relocatable code option.

Won't argue with this, thank you. I've added my two pence to that discussion, hope it's helpful.
 

Offline eutectique (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 396
  • Country: be
Re: Bare-metal ARM, gcc, position-independent code -- possible?
« Reply #15 on: March 04, 2022, 07:47:05 pm »
Quote from: abyrvalg
Another option is to wrap all code ptrs with some adjustment func/macro that will add a precalculated displacement.

Yes, also a possibility, but I wanted to avoid it at all costs.
 

