Author Topic: Understanding this piece of code from bootloader (Read 2210 times)

unscripted · « **on:** December 07, 2022, 12:34:39 am »

My question is mainly about C programming. Below is the snippet for what I wish to understand.

void (*pProgResetHandler)(void);

/* remap user program's vector table */
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET & (uint32_t)0x1FFFFF80;
/* set the address where the bootloader needs to jump to. this is the address of
* the 2nd entry in the user program's vector table. this address points to the
* user program's reset handler.
*/
pProgResetHandler = (void(*)(void))(*((uint32_t *)CPU_USER_PROGRAM_STARTADDR_PTR));
/* the Cortex-M0+ core has interrupts enabled out of reset. the bootloader
* explicitly disables these for security reasons. Enable them here again, so it does
* not have to be done by the user program.
*/
CpuIrqEnable();
/* start the user program by activating its reset interrupt service routine */
pProgResetHandler();

Going line by line:

Code: [Select]

void (*pProgResetHandler)(void);
This is just declaring a pointer to a function taking no parameters and returning nothing. Am I right?

Code: [Select]

pProgResetHandler = (void(*)(void))(*((uint32_t *)CPU_USER_PROGRAM_STARTADDR_PTR));
(*((uint32_t *) means I want to treat CPU_USER_PROGRAM_STARTADDR_PTR as a pointer (casting it) then a deference it to access the memory through a pointer?

(void (*) (void)) says convert that value into a pointer to a function that takes no parameters and returns nothing.

Code: [Select]

pProgResetHandler();
This jumps to reset pointer address by reading or calling pProgResetHandler?

Am I OK in general? I feel I'm not using the right words or interpretation. This looks like advanced C, so I appreciate any help clarifying this.

westfw · « **Reply #1 on:** December 07, 2022, 12:56:27 am »

Quote

Am I OK in general?

Yep.

Quote

Code: [Select]
pProgResetHandler = (void(*)(void))(*((uint32_t *)CPU_USER_PROGRAM_STARTADDR_PTR));

Apparently CPU_USER_PROGRAM_STARTADDR_PTR is a pointer to a location where the start address is stored.
(0x4 on an ARM in general. VTOR+4 is you've changed the vector table address.)
So the rightmost part *((uint32_t *)CPU_USER_PROGRAM_STARTADDR_PTR) casts that value to a pointer and retrieves the actual start address. (I'm a bit surprised that it isn't already a uint32_t*, though.)
The left part of the statement (void(*)(void)) casts that value to be a function ptr.

The final call to the function pointer variable just treats the value as a code address and jumps to it.
It should compile to one fetch and one branch instruction; most of it is just telling the compiler how to interpret things...

Quote

This looks like advanced C

I dunno. I guess function pointers in general are relatively advanced.
This is more ... needlessly obfuscated code. A typedef would have helped.
(But perhaps it could have been worse, if CPU_USER_PROGRAM_STARTADDR_PTR had been cast to a pointer to pointer to code.)

ataradov · « **Reply #2 on:** December 07, 2022, 01:09:32 am »

This is an absurdly complicated version of this https://github.com/ataradov/bootloaders/blob/4fd26c0f4ce1bd2f07374b7f513ff22be3d202b7/firmware/usb_dfu_samd11/main.c#L85

Also note that this code is not correct, since it does not set the stack pointer to the one configured by the application. Application would just use bootloader stack.

I personally don't think that bootloader should change SCB->VTOR, but if it does, then this:

Quote

SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET & (uint32_t)0x1FFFFF80;

indicates misunderstanding of what is going on. "& (uint32_t)0x1FFFFF80" is redundant and does not do anything useful. If CPU_USER_PROGRAM_VECTABLE_OFFSET is not correctly aligned, then hard aligning it like this will not make it work any better. Those lower bits are not implemented and would be zero anyway.

redkitedesign · « **Reply #3 on:** December 07, 2022, 03:47:07 am »

Quote from: ataradov on December 07, 2022, 01:09:32 am

This is an absurdly complicated version of this https://github.com/ataradov/bootloaders/blob/4fd26c0f4ce1bd2f07374b7f513ff22be3d202b7/firmware/usb_dfu_samd11/main.c#L85

Then again, the original is just plain C. Your example requires knowledge of assembly.
It does have the advantage of the stack starting exactly where the app wanted it. The original will consume 4 bytes for the return address. G*d help you when those 4 bytes are relevant.

It's also specific for a single architecture, but that is hardly relevant for a bootloader.

I've implemented it in the past as:

void boot(bootloader_info_block* bib)
{
void (*app)(void) = (void(*)(void)) * ((uint32_t*)(bib->app_start+4));

// relocate vector table to start of app

SCB->VTOR = bib->app_start & SCB_VTOR_TBLOFF_Msk;

// reset stack pointer
__set_MSP(*(uint32_t*)bib->app_start);
// make the jump
app();
// should the app return (it shouldn't, but hey: Somebody else will write it)
// Reset and try again
NVIC_SystemReset();
}

ataradov · « **Reply #4 on:** December 07, 2022, 03:56:47 am »

Quote from: redkitedesign on December 07, 2022, 03:47:07 am

The original will consume 4 bytes for the return address. G*d help you when those 4 bytes are relevant.

Or it can break entirely if application does not put the stack at the end of SRAM, reserving some space for other storage, for example. Or bootloader is universal and written for the lowest memory size device, so on a bigger device it would end up in the middle of SRAM.

Quote from: redkitedesign on December 07, 2022, 03:47:07 am

SCB->VTOR = bib->app_start & SCB_VTOR_TBLOFF_Msk;

Once again, this & is completely pointless. If your app_start is not aligned already, you are screwed no matter what.

Quote from: redkitedesign on December 07, 2022, 03:47:07 am

__set_MSP(*(uint32_t*)bib->app_start);

Sure, but this is also almost assembly.

Ideally this set of the SP should be a part of the same assembly section. My code takes a risk with that too. If compiler decided to include any operations with the stack between __set_MSP() and the call, those operation would be done on the application stack. This is unlikely, but in the new code I do it as a single block for a piece of mind.

redkitedesign · « **Reply #5 on:** December 07, 2022, 04:36:41 am »

Quote from: ataradov on December 07, 2022, 03:56:47 am

Quote from: redkitedesign on December 07, 2022, 03:47:07 am
The original will consume 4 bytes for the return address. G*d help you when those 4 bytes are relevant.
Or it can break entirely if application does not put the stack at the end of SRAM, reserving some space for other storage, for example. Or bootloader is universal and written for the lowest memory size device, so on a bigger device it would end up in the middle of SRAM.

I was referring to using assembly for jumping to the application, versus using a function pointer.
Neither does anything to ensure the application gets the correct stack setup.

However, using a function pointer and calling takes 4 bytes from the applications stack, where-ever it is located.

Quote from: ataradov on December 07, 2022, 03:56:47 am

Ideally this set of the SP should be a part of the same assembly section. My code takes a risk with that too. If compiler decided to include any operations with the stack between __set_MSP() and the call, those operation would be done on the application stack. This is unlikely, but in the new code I do it as a single block for a piece of mind.

Both your code and my code use a local variable (which may be stack based) after changing the stack pointer.
In terms of risk that is very close to suicidal.

However, in both cases the variable is initialized with the value of a non-local, fixed address, and the variable is not changed anyhow so the compiler will probably refer to the original, non-local address when the variable is used.

But I've checked the assembly when I wrote it :-)

ataradov · « **Reply #6 on:** December 07, 2022, 04:49:52 am »

Quote

But I've checked the assembly when I wrote it :-)

So did I, but it bothers me. So, the new code just reads:

Code: [Select]

asm("msr MSP, %0; bx %1"::"r" (msp), "r" (reset_vector));

No need for __set_MSP() and the same result.

It is also helps with people inserting code between those lines during code extension. I've seen this happen in practice reviewing random code. Having it as one assembly section is way better.

cv007 · « **Reply #7 on:** December 07, 2022, 05:20:34 am »

Quote

However, using a function pointer and calling takes 4 bytes from the applications stack, where-ever it is located.

If using gcc you can use goto (gcc docs will explain its use and limitations)-

goto **(uint32_t*)(2048+4);

which will get you a jmp/bx/similar.

abyrvalg · « **Reply #8 on:** December 07, 2022, 03:02:39 pm »

BLX doesn’t push the return address onto the stack (it moves it to LR instead), it is the caller function prologue who probably does that. Use __attribute__((noreturn)) or __builtin_unreachable() to neutralize that.

bson · « **Reply #9 on:** December 13, 2022, 02:52:07 am »

Quote from: ataradov on December 07, 2022, 01:09:32 am

Quote
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET & (uint32_t)0x1FFFFF80;
indicates misunderstanding of what is going on. "& (uint32_t)0x1FFFFF80" is redundant and does not do anything useful. If CPU_USER_PROGRAM_VECTABLE_OFFSET is not correctly aligned, then hard aligning it like this will not make it work any better. Those lower bits are not implemented and would be zero anyway.

Couldn't agree with this more!

Not only this, but I'd also add a static assert:

Quote

static_assert((CPU_USER_PROGRAM_VECTABLE_OFFSET & 0x1FFFFF80) == 0x1FFFFF80);
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET;

This way, if CPU_USER_PROGRAM_VECTABLE_OFFSET has any other bits set it can't work no matter what, and I'd rather have the compiler tell me.

In fact I usually have a block of static asserts after including any config.h or param.h or such that contains constants like this. It both validates them and constitutes a documentation of sorts of what's expected. Put in something bad and I find out immediately.

Nominal Animal · « **Reply #10 on:** December 13, 2022, 11:20:12 am »

Quote from: ataradov on December 07, 2022, 01:09:32 am

I personally don't think that bootloader should change SCB->VTOR, but if it does, then this:
Quote
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET & (uint32_t)0x1FFFFF80;
indicates misunderstanding of what is going on. "& (uint32_t)0x1FFFFF80" is redundant and does not do anything useful. If CPU_USER_PROGRAM_VECTABLE_OFFSET is not correctly aligned, then hard aligning it like this will not make it work any better. Those lower bits are not implemented and would be zero anyway.

Much better would be a comment stating the obvious, i.e.
/* Hardware only supports multiple of 128 CPU_USER_PROGRAM_VECTABLE_OFFSET, low 7 bits zero. */
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET;

Stealing from the BUILD_BUG_ON_ZERO(expr) in the Linux kernel, one could also use
#define REQUIRE_ZERO(expr) ((void)(sizeof(struct { int:(-!!(expr)); })))
and
REQUIRE_ZERO(CPU_USER_PROGRAM_VECTABLE_OFFSET & 127);
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET;
which would error out during build time, if CPU_USER_PROGRAM_VECTABLE_OFFSET is not a multiple of 128.

Best option, in my opinion, would be
/* Hardware only supports multiple of 128 CPU_USER_PROGRAM_VECTABLE_OFFSET, low 7 bits zero. */
BUILD_ASSERT((CPU_USER_PROGRAM_VECTABLE_OFFSET & 127) == 0, "CPU_USER_PROGRAM_VECTABLE_OFFSET is not aligned to a 128 byte boundary");
SCB->VTOR = CPU_USER_PROGRAM_VECTABLE_OFFSET;
where BUILD_ASSERT() is defined as
#ifdef __cplusplus
#define BUILD_ASSERT(expr, msg) static_assert(expr, msg)
#else
#define BUILD_ASSERT(expr, msg) _Static_assert(expr, msg)
#endif

static_assert() was standardized in C++11 and _Static_assert() in C11, and should be available in all current C and C++ compilers (for 32-bit ARM architectures). They generate no runtime code (and have no effect at runtime), they're purely a compile-time construct; and they require expr to be a constant the compiler can evaluate at compiler time.

Ground_Loop · « **Reply #11 on:** December 18, 2022, 04:25:01 pm »

Pardon my ignorance, but can someone explain what this means: (struct { int:(-!!(expr)); }))

Nominal Animal · « **Reply #12 on:** December 18, 2022, 04:51:28 pm »

Quote from: Ground_Loop on December 18, 2022, 04:25:01 pm

Pardon my ignorance, but can someone explain what this means: (struct { int:(-!!(expr)); }))

The expression sizeof (struct { int:(-!!(expr)); }) computes the size of the type (struct { int:(-!!(expr)); }).

This type is a structure with a single anonymous (unnamed) bitfield member, whose size in bits is -!!(expr).

The two ! are just two not operators back-to-back, often called not-not. !!x evaluates to 0 if x is 0 or false, and to 1 if x is nonzero or true.

Therefore, -!!(expr) evaluates to 0 if expr is zero or false, and to -1 otherwise.

Bitfields of width 0 are explicitly allowed by the C standard (the declaration of a bitfield is ignored if the width is zero), so when expr is zero or false, sizeof (struct { int:(-!!(expr)); }) evaluates to zero at build time; we can read it in English as "the size of a structure with no members".

However, when expr is nonzero, the bitfield width is -1, negative, and the C standard does not allow that, causing the compiler to abort compilation with an error.

Similarly, if expr is not a constant value at build time, the compiler cannot determine the width of the bitfield, and aborts the compilation.

Thus, it is a build-time "fail if this expression is not a compile-time zero" expression, that generates no code (because it evaluates to a simple integer that does not cause any code to be generated). Casting the entire expression using (void) just tells the compiler that it is okay to ignore that value, we just need to make sure it is computable.

Siwastaja · « **Reply #13 on:** December 18, 2022, 05:23:53 pm »

In other words,

Code: [Select]

#define REQUIRE_ZERO(expr) ((void)(sizeof(struct { int:(-!!(expr)); })))is just a weird trick to add a compile-time check in absence of _Static_assert before C11.

These tricks look absolutely ugly, which is why one often names them properly (like REQUIRE_ZERO here) and buries the actual #define in some misc.h or similar header, brushing the ugliness under the mat.

But it definitely does the job, and having such compile-time assert makes the actual code more robust against human error.

Same can be said about the usual

Code: [Select]

#define NUM_ELEM(tbl) (sizeof(tbl)/sizeof(tbl[0]))
The thing itself looks ugly and weird, but it's highly useful and makes C behave a bit more high-level and more abstracted language when used extensively, reducing human workload and copy-pasta errors when one can iterate through an array (fixed size or VLA), or check for out-of-bounds access, even if the array size changes, without having to define some stupid #define MY_ARRAY_LEN and remembering to use it everywhere.

Only their implementations are "advanced C". I would like to see beginners use these sort of things more, though, as it will improve the code readability and maintainability, and reduce bugs. It's OK to copypaste the ugly implementations without understanding exactly how they work, as long as you know how and why to use them.

voltsandjolts · « **Reply #14 on:** December 18, 2022, 05:44:37 pm »

Quote from: Nominal Animal on December 13, 2022, 11:20:12 am

where BUILD_ASSERT() is defined as
#ifdef __cplusplus
#define BUILD_ASSERT(expr, msg) static_assert(expr, msg)
#else
#define BUILD_ASSERT(expr, msg) _Static_assert(expr, msg)
#endif

I could be missing something here, but I don't think that's needed.
C11 defines the static_assert macro already for C++ compatibility.

Quote

7.2 Diagnostics <assert.h>
The header <assert.h> defines the assert and static_assert macros and ...

ataradov · « **Reply #15 on:** December 18, 2022, 06:00:19 pm »

Yes, starting with C11 you can use standard static assert functionality. They are slowly cleaning it up.

Nominal Animal · « **Reply #16 on:** December 18, 2022, 06:54:33 pm »

Quote from: voltsandjolts on December 18, 2022, 05:44:37 pm

Quote from: Nominal Animal on December 13, 2022, 11:20:12 am
where BUILD_ASSERT() is defined as
#ifdef __cplusplus
#define BUILD_ASSERT(expr, msg) static_assert(expr, msg)
#else
#define BUILD_ASSERT(expr, msg) _Static_assert(expr, msg)
#endif

I could be missing something here, but I don't think that's needed.
C11 defines the static_assert macro already for C++ compatibility.

Quote
7.2 Diagnostics <assert.h>
The header <assert.h> defines the assert and static_assert macros and ...

I assumed we are talking about freestanding code here. In C, that includes only header files <float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, and <stdnoreturn.h>. So, <assert.h> may or may not be available. However, _Static_assert() will be, if the compiler supports C11 or later.

The fact that the macro definition may look ugly as all hell, doesn't matter. What matters is that it's side effects are documented (none, here, for BUILD_ASSERT() or REQUIRE_ZERO() –– see the standard for the former; for the latter, because the expression is basically a sizeof expression, no side effects in the expression will be applied: something like REQUIRE_ZERO(--x) will not work, x will not be decremented), and that they are properly used to confirm build-time requirements if and when applicable.

Again, most important thing is to document the assumptions and requirements. I agree with ataradov's opinion on run-time checking such stuff, and consider the utmost priority to be to document all requirements and assumptions, with a weak preference on build-time tests verifying the ones that can be easily verified without runtime overhead, and avoiding run-time tests in embedded software. Things like "does this architecture use IEEE-754 Binary32 as float?" are nontrivial to check (because there are architectures that seem to, but do not support subnormals), so documenting the requirements should always be the priority; and the build-time checks just trivial "hey, you forgot this" -type helpers for complex projects.

(In fully hosted environments, I prefer separate run-time test suites to included run-time tests; but, for highly portable code, letting the user know whoever built the binaries did it wrong, can sometimes be worth the overhead.)

cv007 · « **Reply #17 on:** December 19, 2022, 04:03:05 am »

A better place for an address such as this is in the linker script. The flash region size should be set to the bootloader size so you do not produce code that exceeds the bootloader allotted size, and from that you can create a symbol for the app address (which unless you are doing something unusual, will already be 128bit aligned using only the flash region size to come up with the app start symbol). If/when your bootloader size needs changing, you go to the linker script for the change, not a define. The app will also have the flash region set to the correct origin/size in its linker script.

https://godbolt.org/z/c175aE9hd

All a moot point if you cannot get the bootloader and app to use the same value.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

EEVblog Electronics Community Forum

Author Topic: Understanding this piece of code from bootloader (Read 2210 times)

unscripted

Understanding this piece of code from bootloader

westfw

Re: Understanding this piece of code from bootloader

ataradov

Re: Understanding this piece of code from bootloader

redkitedesign

Re: Understanding this piece of code from bootloader

ataradov

Re: Understanding this piece of code from bootloader

redkitedesign

Re: Understanding this piece of code from bootloader

ataradov

Re: Understanding this piece of code from bootloader

cv007

Re: Understanding this piece of code from bootloader

abyrvalg

Re: Understanding this piece of code from bootloader

bson

Re: Understanding this piece of code from bootloader

Nominal Animal

Re: Understanding this piece of code from bootloader

Ground_Loop

Re: Understanding this piece of code from bootloader

Nominal Animal

Re: Understanding this piece of code from bootloader

Siwastaja

Re: Understanding this piece of code from bootloader

voltsandjolts

Re: Understanding this piece of code from bootloader

ataradov

Re: Understanding this piece of code from bootloader

Nominal Animal

Re: Understanding this piece of code from bootloader

cv007

Re: Understanding this piece of code from bootloader

Share me