STM 32F4 FPU registers and main() gotcha
peter-h:
I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.
AIUI, it relates to C compilers treating the function main() differently when it comes to FPU stack operations. This is pretty weird, to be generating different code for a function, based on its name!
Maybe because I am using GCC (v11) and this version of GCC just happens to work i.e. does not emit the extra stack pushes/pops.
To work around this, the FPU enable code would need to go into the startup.s code i.e. before main() is entered. I am doing that but purely by accident; my startupxxx.s code called b_main() and that starts the FPU with
--- Code: --- // ========== This was in SystemInit() ============
#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
SCB->CPACR |= ((3UL << 10*2)|(3UL << 11*2)); /* set CP10 and CP11 Full Access */
#endif
--- End code ---
That code is commonly used Cube MX ("HAL") stuff which you find all over the internet...
FreeRTOS seems to do it again when it starts up (inside main() this time):
--- Code: --- /* Ensure the VFP is enabled - it should be anyway. */
vPortEnableVFP();
/* Lazy save always. */
*( portFPCCR ) |= portASPEN_AND_LSPEN_BITS;
and vPortEnableVFP() contains
/* This is a naked function. */
static void vPortEnableVFP( void )
{
__asm volatile
(
" ldr.w r0, =0xE000ED88 \n" /* The FPU enable bits are in the CPACR. */
" ldr r1, [r0] \n"
" \n"
" orr r1, r1, #( 0xf << 20 ) \n" /* Enable CP10 and CP11 coprocessors, then save back. */
" str r1, [r0] \n"
" bx r14 "
);
}
--- End code ---
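For reference, the two snippets above set the same four bits. A minimal host-side sketch of that, assuming nothing about the real hardware: CPACR is passed in as a plain value instead of being read from 0xE000ED88, and the macro and helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mask as written in SystemInit(): CP10 and CP11 each have a 2-bit field,
   and 0b11 in a field means full access. */
#define CPACR_CP10_CP11_FULL  ((3UL << (10 * 2)) | (3UL << (11 * 2)))

/* Constant as written in vPortEnableVFP(). */
#define FREERTOS_VFP_ENABLE   (0xFUL << 20)

/* Hypothetical helper: models the read-modify-write on a plain value.
   On the target the register lives at 0xE000ED88; here it is a parameter. */
static uint32_t cpacr_enable_fpu(uint32_t cpacr)
{
    uint32_t updated = cpacr | (uint32_t)CPACR_CP10_CP11_FULL;
    /* Bits 20..23 must all be set once both fields are 0b11. */
    assert(((updated >> 20) & 0xFu) == 0xFu);
    return updated;
}
```

Because the operation is a plain OR, running it a second time (as FreeRTOS does) leaves the value unchanged.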
Does this make sense to anyone? It seems to be working by accident, but it is a really weird thing as it is C compiler dependent, and to be sure you want to enable the FPU in the startup.s code.
wek:
--- Quote ---I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.
--- End quote ---
Because main() usually does not contain FP operations, so usually the compiler does not need to stack FP registers.
Usually, main() consists only of a bunch of function calls. And, usually, those functions - especially if they handle FP - are located in separate files, thus are not subject to inlining.
Even with moderate FP usage within a function there's probably no stacking. I don't remember the details of the ABI, but there are many FP registers, so the callee probably doesn't need to preserve some of them.
The problem happened to me because I don't write programs in the usual way, so quite a significant portion of my programs tend to be either explicitly, or inlined, in main() (I love spaghetti, and have and use a spaghetti-making machine).
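A minimal sketch of the "usual" layout described above, assuming GCC or Clang. The function names and the enable stub are hypothetical; on the target the stub would be the CPACR write. The entry function stays integer-only, and the FP work sits in a function marked noinline so the compiler cannot merge it into the caller. Whether a given compiler would otherwise stack FP registers in the caller's prologue depends on version, optimisation level and LTO, so treat this as an illustration rather than a guarantee.

```c
#include <assert.h>

/* Stand-in for the real FPU enable; on the target this would set CP10/CP11. */
static int fpu_enabled = 0;
static void enable_fpu_stub(void) { fpu_enabled = 1; }

/* All FP work lives here. noinline (a GCC/Clang attribute) keeps it out of
   the caller's body, so any FP register saving happens in this function's
   own prologue, after the FPU has been enabled. */
__attribute__((noinline)) static int average_milli(int a, int b, int c)
{
    assert(fpu_enabled); /* on real hardware this would be a fault instead */
    float avg = ((float)a + (float)b + (float)c) / 3.0f;
    return (int)(avg * 1000.0f); /* average scaled by 1000, as an integer */
}

/* Hypothetical entry function: no FP values of its own, FPU enabled first. */
static int app_entry(void)
{
    enable_fpu_stub();
    return average_milli(1, 2, 3);
}
```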
--- Quote --- my startupxxx.s code called b_main() and that starts the FPU
--- End quote ---
b_main() is a C function, and as such it is vulnerable to the same problem, potential stacking of FP registers - and it does not happen for the same reason: you most probably have no FP operations in that function.
--- Quote ---
--- Code: ---/* This is a naked function. */
static void vPortEnableVFP( void )
--- End code ---
--- End quote ---
If it's naked indeed (i.e. there is somewhere a prototype with __attribute__((naked))), then there's no C prologue, thus no register stacking and no vulnerability of the kind described. However, the functions leading to the call of vPortEnableVFP() *are* vulnerable - but, again, FreeRTOS functions most probably have no FP operations in them.
JW
brucehoult:
--- Quote from: peter-h on July 22, 2024, 01:50:47 pm ---I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.
--- End quote ---
Nothing STM or even Arm-specific in that.
If you're going to use an FPU (or vector unit, on ISAs / cores that have them) then you need to enable them before running a function that uses them, where "using" could involve arithmetic or, yes, storing or loading FPU registers.
You can perfectly well do that in main(), just as long as main() is running in privileged mode and doesn't itself use the FPU (etc) before initialising it -- including using it by saving registers in the prologue.
This will apply to anything that has an initially-disabled functional unit: It's certainly true on RISC-V (both FPU and Vector units, if present and used, need to be changed from "Off" to "Initial" or "Clean" in the mstatus.FS and mstatus.VS fields) and I'd imagine it is similar on x86, MIPS, PowerPC, ... too.
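As a rough host-side illustration of the RISC-V case: FS occupies mstatus bits 14:13 and VS occupies bits 10:9 (vector extension 1.0), each encoding Off/Initial/Clean/Dirty as 0 to 3. The sketch below only models the bit arithmetic on a plain integer; real code would do this with CSR instructions in machine mode, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in mstatus: FS = bits 14:13, VS = bits 10:9. */
#define MSTATUS_FS_SHIFT   13u
#define MSTATUS_VS_SHIFT   9u
#define MSTATUS_FIELD_MASK 0x3u

enum ext_state { EXT_OFF = 0, EXT_INITIAL = 1, EXT_CLEAN = 2, EXT_DIRTY = 3 };

/* Hypothetical helper: returns mstatus with the 2-bit field at 'shift'
   replaced by the requested state. */
static uint32_t set_ext_state(uint32_t mstatus, unsigned shift, enum ext_state s)
{
    assert((unsigned)s <= 3u);
    mstatus &= ~(MSTATUS_FIELD_MASK << shift); /* clear the old state */
    mstatus |= ((uint32_t)s << shift);         /* write the new state */
    return mstatus;
}

/* Hypothetical helper: reads the 2-bit state back out. */
static enum ext_state get_ext_state(uint32_t mstatus, unsigned shift)
{
    return (enum ext_state)((mstatus >> shift) & MSTATUS_FIELD_MASK);
}
```

Moving FS from Off to Initial on an otherwise zero value gives 0x2000, which is the bit a startup routine would need to set before any function touches an FP register.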
peter-h:
Thank you both.
I am certainly not using floats before enabling the FPU (which is done in b_main() which then does a long jump to main() which never returns) and I would hope that if I was, it would comprehensively not work :)
It is probably by accident that main() does not use floats currently. I do have some printf() debug calls in there (printf() being mapped to come out on the SWV ITM debug port) which output longs but not floats. If they were floats, would that matter? I am confused.
dietert1:
Today something similar happened when I worked on a small Win32 test app (network client).
There were no FPU operations in main(), but some in a thread started with CreateThread(). The app failed with "FPU not initialized" error. I solved the problem using _beginthreadex() instead and it worked. I learned that _beginthreadex() includes necessary CRT initializations.
Regards, Dieter