Author Topic: 32F4 hard fault trap - how to track this down?  (Read 9281 times)


peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3700
  • Country: gb
  • Doing electronics since the 1960s...
32F4 hard fault trap - how to track this down?
« on: July 22, 2022, 07:40:23 pm »


The stack trace is just nonsense - CPU executing invalid opcodes. But the 0xFFFFFFED is a clue, and it googles to various things, but I don't see a systematic method. One guy found it by lighting up an LED and moving the point along his code until it stayed lit at the trap. In my case I have loads of code - about 300k.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline thm_w

  • Super Contributor
  • ***
  • Posts: 6389
  • Country: ca
  • Non-expert
Re: 32F4 hard fault trap - how to track this down?
« Reply #1 on: July 22, 2022, 08:35:32 pm »
Look at the call stack. Was a specific function running each time, or is it a different function each time?

This might give some clues: https://github.com/ferenc-nemeth/arm-hard-fault-handler
Profile -> Modify profile -> Look and Layout ->  Don't show users' signatures
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11262
  • Country: us
    • Personal site
« Reply #2 on: July 22, 2022, 10:52:35 pm »
It is a good idea to not rely on the IDE for stack trace.

Here is an example of an instrumented HF handler:
Code: [Select]
//-----------------------------------------------------------------------------
void irq_handler_hard_fault_c(uint32_t lr, uint32_t msp, uint32_t psp)
{
  uint32_t s_r0, s_r1, s_r2, s_r3, s_r12, s_lr, s_pc, s_psr;
  uint32_t r_CFSR, r_HFSR, r_DFSR, r_AFSR, r_BFAR, r_MMAR;
  uint32_t *sp = (uint32_t *)((lr & 4) ? psp : msp);

  s_r0  = sp[0];
  s_r1  = sp[1];
  s_r2  = sp[2];
  s_r3  = sp[3];
  s_r12 = sp[4];
  s_lr  = sp[5];
  s_pc  = sp[6];
  s_psr = sp[7];

  r_CFSR = SCB->CFSR;  // Configurable Fault Status Register (MMSR, BFSR and UFSR)
  r_HFSR = SCB->HFSR;  // Hard Fault Status Register
  r_DFSR = SCB->DFSR;  // Debug Fault Status Register
  r_MMAR = SCB->MMFAR; // MemManage Fault Address Register
  r_BFAR = SCB->BFAR;  // Bus Fault Address Register
  r_AFSR = SCB->AFSR;  // Auxiliary Fault Status Register

  asm("nop"); // Setup breakpoint here

  while (1);
}

//-----------------------------------------------------------------------------
__attribute__((naked)) void irq_handler_hard_fault(void) // Rename to whatever you have in the vector table
{
  asm volatile (R"asm(
    mov    r0, lr
    mrs    r1, msp
    mrs    r2, psp
    b      irq_handler_hard_fault_c
    )asm"
  );
}

Once you get to the C handler, you will have all the saved context. The most interesting value here is s_lr, since it holds the return address of the last call that was made before the fault. You can follow the code (assembly) from there and then correlate the values in the saved registers with what the code was likely doing.

And the values of all those xxxSR registers would tell you the nature of the fault.
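
As a rough illustration of reading those status registers, here is a minimal host-side sketch that turns a raw CFSR value into flag names. The bit positions follow the ARMv7-M definitions of MMFSR, BFSR and UFSR; the names `cfsr_bits` and `cfsr_describe` are made up for this example and are not part of CMSIS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One named flag in the Configurable Fault Status Register (ARMv7-M). */
struct cfsr_bit { uint32_t mask; const char *name; };

/* Hypothetical lookup table, ordered from bit 0 upwards. */
static const struct cfsr_bit cfsr_bits[] = {
    /* MemManage Fault Status, CFSR bits 7:0 */
    { 1u << 0,  "IACCVIOL" },    { 1u << 1,  "DACCVIOL" },
    { 1u << 3,  "MUNSTKERR" },   { 1u << 4,  "MSTKERR" },
    { 1u << 5,  "MLSPERR" },     { 1u << 7,  "MMARVALID" },
    /* BusFault Status, CFSR bits 15:8 */
    { 1u << 8,  "IBUSERR" },     { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR" },
    { 1u << 12, "STKERR" },      { 1u << 13, "LSPERR" },
    { 1u << 15, "BFARVALID" },
    /* UsageFault Status, CFSR bits 31:16 */
    { 1u << 16, "UNDEFINSTR" },  { 1u << 17, "INVSTATE" },
    { 1u << 18, "INVPC" },       { 1u << 19, "NOCP" },
    { 1u << 24, "UNALIGNED" },   { 1u << 25, "DIVBYZERO" },
};

/* Writes the names of all set flags into buf, separated by spaces.
   Returns the number of names written; stops early if buf is too small. */
int cfsr_describe(uint32_t cfsr, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;

    if (len > 0)
        buf[0] = '\0';

    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if ((cfsr & cfsr_bits[i].mask) == 0)
            continue;
        size_t name_len = strlen(cfsr_bits[i].name);
        size_t need = name_len + (count ? 1u : 0u);
        if (used + need + 1u > len)   /* no room for name plus terminator */
            break;
        if (count)
            buf[used++] = ' ';
        memcpy(buf + used, cfsr_bits[i].name, name_len);
        used += name_len;
        buf[used] = '\0';
        count++;
    }
    return count;
}
```

In the handler you would pass the captured r_CFSR to something like this, or simply compare the value against the same table by eye in the debugger.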
Alex
 

peter-h (Topic starter)
« Reply #3 on: July 23, 2022, 06:21:11 am »
The above shows syntax errors



This is an example of my asm code which is ok



but I am reluctant to go hacking yours in case you did something for a reason :)

Making the obvious changes produces loads of warnings

« Last Edit: July 23, 2022, 06:25:08 am by peter-h »
 

ataradov
« Reply #4 on: July 23, 2022, 06:27:35 am »
What syntax errors? Does the compiler complain or just the IDE? You are too reliant on IDEs.

Your IDE may not understand this style of strings. You can change to whatever it understands as long as the code remains the same.

I see the edit with the error message. What version of the compiler do you have?

Ok, raw strings seem to be an extension for now, so you need -std=gnu99 or -std=gnu11 passed to the compiler. Or rewrite the strings without the raw stuff.

Warnings are normal, those variables are not used. This is an excerpt from code that used to print them, but you can just look at them in the debugger.
« Last Edit: July 23, 2022, 06:32:07 am by ataradov »
Alex
 

peter-h (Topic starter)
« Reply #5 on: July 23, 2022, 07:03:04 am »
Those are compiler errors.

GCC v10.

Apologies; I find the ARM asm syntax impenetrable. I programmed in asm for decades but this is something else :)

I would be grateful for any help. I think a part of it is that one can quote each line, or a whole section of asm, but I can't get anything to compile.

This is nearer, I think:

 

ataradov
« Reply #6 on: July 23, 2022, 07:10:10 am »
You need  \n at the end of all those new strings. That's why I used raw strings, so I won't have to do that nonsense myself.
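
The reason for the `\n`: adjacent string literals are joined by the compiler into one string before the assembler ever sees it, so without a newline the instructions run together on a single line. A small host-side sketch of that effect follows; the array and function names are invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Adjacent string literals are concatenated at compile time, exactly as
   they would be inside asm("..." "..."). */
static const char asm_without_newlines[] =
    "mov    r0, lr"
    "mrs    r1, msp";

static const char asm_with_newlines[] =
    "mov    r0, lr\n"
    "mrs    r1, msp\n";

/* Counts the lines the assembler would see: one per '\n', plus a final
   unterminated line if the text does not end in '\n'. */
int count_asm_lines(const char *text)
{
    int lines = 0;
    size_t len = strlen(text);

    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\n')
            lines++;
    }
    if (len > 0 && text[len - 1] != '\n')
        lines++;
    return lines;
}
```

The first array holds a single line that no assembler will accept as one instruction, which matches the kind of error the per-line quoting attempt ran into.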

ARM syntax is one of the easiest to understand. I'm not sure what is so hard about it.
Alex
 

peter-h (Topic starter)
« Reply #7 on: July 23, 2022, 07:43:27 am »
You are right of course; I need to read my own code more carefully :)

I also set optimisation to -O0 for this function, otherwise some of the variables don't show up in the debugger.

Code: [Select]
// This was originally used to print out those registers
__attribute__((optimize("O0")))
void irq_handler_hard_fault_c(uint32_t lr, uint32_t msp, uint32_t psp)
{
  uint32_t s_r0, s_r1, s_r2, s_r3, s_r12, s_lr, s_pc, s_psr;
  uint32_t r_CFSR, r_HFSR, r_DFSR, r_AFSR, r_BFAR, r_MMAR;
  uint32_t *sp = (uint32_t *)((lr & 4) ? psp : msp);

  s_r0  = sp[0];
  s_r1  = sp[1];
  s_r2  = sp[2];
  s_r3  = sp[3];
  s_r12 = sp[4];
  s_lr  = sp[5];
  s_pc  = sp[6];
  s_psr = sp[7];

  r_CFSR = SCB->CFSR;  // Configurable Fault Status Register (MMSR, BFSR and UFSR)
  r_HFSR = SCB->HFSR;  // Hard Fault Status Register
  r_DFSR = SCB->DFSR;  // Debug Fault Status Register
  r_MMAR = SCB->MMFAR; // MemManage Fault Address Register
  r_BFAR = SCB->BFAR;  // Bus Fault Address Register
  r_AFSR = SCB->AFSR;  // Auxiliary Fault Status Register

  asm("nop"); // Setup breakpoint here

  while (1);
}

// Rename this to whatever you have in the vector table
__attribute__((naked)) void HardFault_Handler(void)
{
//asm volatile (R"asm(
asm volatile (
    "mov    r0, lr \n"
    "mrs    r1, msp \n"
    "mrs    r2, psp \n"
    "b      irq_handler_hard_fault_c \n"
    //)asm"
  );
}

I looked for ways to make the Cube stack trace longer but can't find anything. The stack trace length varies anyway.
 

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1719
  • Country: se
« Reply #8 on: July 23, 2022, 08:18:57 am »
> The stack trace is just nonsense - CPU executing invalid opcodes. But the 0xFFFFFFED is a clue, and it googles to various things, but I don't see a systematic method.

https://developer.arm.com/documentation/ddi0403/d/System-Level-Architecture/System-Level-Programmers--Model/ARMv7-M-exception-model/Exception-return-behavior?lang=en
Nandemo wa shiranai wa yo, shitteru koto dake.
 

peter-h (Topic starter)
« Reply #9 on: July 23, 2022, 08:29:38 am »
Yes; it is something to do with the FPU, but (if so) how?

Now that I have put in the extra debug above, the target has decided to run for longer ;)
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 495
  • Country: sk
« Reply #10 on: July 23, 2022, 08:42:02 am »
Just don't get fixated on a particular value. Look at the stack as thm_w said. You can also post it for us to chew on, together with the content of the processor registers.

Btw. if you want some other particular value to analyze, then 0x656d6974, that's "time" or "emit" :)

JW
 

newbrain
« Reply #11 on: July 23, 2022, 08:46:51 am »
> Yes; it is something to do with the FPU, but (if so) how?
Got to a real keyboard now, so here's some basic info:
0xFFFFFFED is a magic value (EXC_RETURN) that is loaded into the lr register when an exception is entered; in a regular subroutine call the return address would be loaded there instead.
When a return is executed (e.g., a branch to the value in lr), the magic value indicates that this is not a regular subroutine return but an exception return.
In a Cortex-M with an FP extension, the stack frame that is automatically saved on exception entry might (Extended) or might not (Basic) contain FP registers.
As the table states, 0xFFFFFFED means: return to Thread mode, use the Process stack pointer, and restore an Extended stack frame.
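
The table lookup described above can be sketched as a few bit tests. This is a minimal host-side illustration assuming the ARMv7-M encoding with the FP extension; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an EXC_RETURN value (hypothetical helper type). */
struct exc_return_info {
    bool looks_valid;    /* bits 31:5 are all ones in every value the table lists */
    bool thread_mode;    /* bit 3: 1 = return to Thread mode, 0 = Handler mode */
    bool uses_psp;       /* bit 2: 1 = frame is on the Process stack, 0 = Main stack */
    bool extended_frame; /* bit 4: 0 = frame includes FP state, 1 = basic frame */
};

struct exc_return_info decode_exc_return(uint32_t lr)
{
    struct exc_return_info info;

    info.looks_valid    = (lr & 0xFFFFFFE0u) == 0xFFFFFFE0u;
    info.thread_mode    = (lr & (1u << 3)) != 0;
    info.uses_psp       = (lr & (1u << 2)) != 0;
    info.extended_frame = (lr & (1u << 4)) == 0;
    return info;
}
```

The `(lr & 4) ? psp : msp` expression in the instrumented handler earlier in the thread is the `uses_psp` test from this sketch.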
 

peter-h (Topic starter)
« Reply #12 on: July 23, 2022, 10:21:53 am »
It is still running :)

But I had an idea, relating to the extended stack frame: I read somewhere that the printf family uses the heap for the %f implementation. Clearly that would be a disaster for thread safety under an RTOS. It is library dependent and I can't find GCC-specific info, but if true this would definitely be a vulnerability, because a) malloc and free are definitely not thread-safe (I can mutex them, having made sure none are used before mutexes become available, which is quite late in my main()) and b) the heap is a stupid idea anyway when used that way, due to fragmentation.

I can modify all instances of %f to output two integers etc.
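
One possible shape for the "two integers" idea, as a minimal sketch: scale the value, round it, and print the whole and fractional parts with integer conversions only. The function name `format_fixed3` and the three-decimal format are assumptions for illustration, and the sketch does not settle whether the library's %f path actually touches the heap.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats value with three decimal places using only integer conversions.
   Assumes |value| is below roughly 4 million so the scaled result fits in
   32 bits; larger inputs are not handled in this sketch. */
int format_fixed3(char *buf, size_t len, float value)
{
    const char *sign = "";

    if (value < 0.0f) {
        sign = "-";
        value = -value;
    }

    /* Scale to thousandths and round to nearest. */
    uint32_t scaled = (uint32_t)(value * 1000.0f + 0.5f);

    return snprintf(buf, len, "%s%lu.%03lu", sign,
                    (unsigned long)(scaled / 1000u),
                    (unsigned long)(scaled % 1000u));
}
```

Field width (as in %7.3f) would have to be added by hand, for example by formatting into a scratch buffer first and then printing that with %7s.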

EDIT: I found malloc() in the library. No source but I set a breakpoint on the FLASH address. An ignore count of 5 gets me past known usage. Then I don't see any calls to it despite using stuff like %7.3f in printfs.
« Last Edit: July 23, 2022, 10:42:37 am by peter-h »
 

peter-h (Topic starter)
« Reply #13 on: July 23, 2022, 11:33:30 am »
It finally bombed:

Code: [Select]
lr uint32_t 0xffffffed (Hex)
msp uint32_t 0x2001ffe0 (Hex)
psp uint32_t 0x100079f0 (Hex)
s_r0 uint32_t 0x200060e8 (Hex)
s_r1 uint32_t 0x200060ec (Hex)
s_r2 uint32_t 0x64 (Hex)
s_r3 uint32_t 0x5c28f5c9 (Hex)
s_r12 uint32_t 0xa5a5a5a5 (Hex)
s_lr uint32_t 0xa5a5a5a5 (Hex)
s_pc uint32_t 0xa5a5a5a5 (Hex)
s_psr uint32_t 0xa5a5a5a5 (Hex)
r_CFSR uint32_t 0x60000000 (Hex)
r_HFSR uint32_t 0x20000 (Hex)
r_DFSR uint32_t 0x40000000 (Hex)
r_AFSR uint32_t 0x0 (Hex)
r_BFAR uint32_t 0xe000ed34 (Hex)
r_MMAR uint32_t 0x9 (Hex)
sp uint32_t * 0x100013dc (Hex)

r0 0xffffffed (Hex)
r1 0x2001ffe0 (Hex)
r2 0x100079f0 (Hex)
r3 0x0 (Hex)
r4 0x804beac (Hex)
r5 0x3e8 (Hex)
r6 0x20002478 (Hex)
r7 0x2001ff88 (Hex)
r8 0x801ce69 (Hex)
r9 0xa5a5a5a5 (Hex)
r10 0xa5a5a5a5 (Hex)
r11 0xa5a5a5a5 (Hex)
r12 0xa0000000 (Hex)
sp 0x2001ff88 (Hex)
lr 0xffffffed (Hex)
pc 0x8041d42 (Hex)
xpsr 0x21000003 (Hex)
d0 0x0 (Hex)
d1 0x0 (Hex)
d2 0x0 (Hex)
d3 0x0 (Hex)
d4 0x0 (Hex)
d5 0x0 (Hex)
d6 0x0 (Hex)
d7 0x0 (Hex)
d8 0x0 (Hex)
d9 0x0 (Hex)
d10 0x0 (Hex)
d11 0x0 (Hex)
d12 0x0 (Hex)
d13 0x0 (Hex)


SP at 0x2001ff88 is reasonable (my general stack is 2001e000-2001ffff).
MSP is the same as above. The CPU switches SP to MSP or PSP, automatically.
PSP at 0x100079f0 is in one of the RTOS stacks (RTOS uses the 64k CCM at 10000000-1000ffff) but I need to restart the target to find out which task it belongs to (I have a graphical display of the CCM block, with a mouseover display of the address and data). But before I restart the target, someone here may have a suggestion to do something else, so I won't restart it yet.
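
The address reasoning above can be written down as a small classifier, which is handy when walking through a long register dump. This is a sketch only: the CCM and general-stack ranges come from this post, while the flash size (1 MB at 0x08000000) and the main SRAM extent (128 KB at 0x20000000) are assumptions about the part, and the names are invented.

```c
#include <stdint.h>

enum mem_region {
    REGION_FLASH,       /* assumed: 0x08000000-0x080FFFFF (1 MB part) */
    REGION_CCM,         /* from the post: RTOS stacks, 0x10000000-0x1000FFFF */
    REGION_MAIN_STACK,  /* from the post: general stack, 0x2001E000-0x2001FFFF */
    REGION_SRAM,        /* assumed: rest of main RAM from 0x20000000 */
    REGION_OTHER
};

enum mem_region classify_address(uint32_t addr)
{
    if (addr >= 0x08000000u && addr <= 0x080FFFFFu)
        return REGION_FLASH;
    if (addr >= 0x10000000u && addr <= 0x1000FFFFu)
        return REGION_CCM;
    if (addr >= 0x2001E000u && addr <= 0x2001FFFFu)
        return REGION_MAIN_STACK;   /* checked before the wider SRAM range */
    if (addr >= 0x20000000u && addr <= 0x2001FFFFu)
        return REGION_SRAM;
    return REGION_OTHER;
}
```

Values such as 0xa5a5a5a5 fall into REGION_OTHER, which is a quick hint that a "saved register" was read from memory that was never written.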

Thank you for any pointers.
 

wek
« Reply #14 on: July 23, 2022, 12:09:26 pm »
I said, look at the *stack*, not just stack pointer. You're about to enter hard stuff. You may find dissecting 300kLOC to be a viable option.

-----

> printf() using malloc() [thus not reentrant]

It may come as a nasty surprise, but by the C standard, *no* library function is reentrant. No, not even abs().

Some versions of printf() may use internal heap, but that's not reentrant either.

https://nadler.com/embedded/newlibAndFreeRTOS.html

Generally, printf() and kin have no place in mcu. If you want to use them, you'll pay all the price, including the hidden portions.

JW
« Last Edit: July 23, 2022, 12:11:08 pm by wek »
 

peter-h (Topic starter)
« Reply #15 on: July 23, 2022, 12:46:14 pm »
Here is the stack above 2000FF88



I see that FFFFFFED value on it. Addresses 100xxxxx are within the RTOS stacks and addresses 200xxxxx are obviously main RAM.

The only FLASH address I see is 8014610 and that is within the RTOS code (port.c) although the .map file shows nothing specific at that address



and it doesn't look like anything I recognise.

Code: [Select]
          prvPortStartFirstTask:
080145f0:   ldr     r0, [pc, #32]   ; (0x8014614)
080145f2:   ldr     r0, [r0, #0]
080145f4:   ldr     r0, [r0, #0]
080145f6:   msr     MSP, r0
080145fa:   mov.w   r0, #0
080145fe:   msr     CONTROL, r0
08014602:   cpsie   i
08014604:   cpsie   f
08014606:   dsb     sy
0801460a:   isb     sy
0801460e:   svc     0
08014610:   nop     
282       }
08014612:   movs    r0, r0
08014614:           ; <UNDEFINED> instruction: 0xed08e000
708        __asm volatile
          vPortEnableVFP:
08014618:   ldr.w   r0, [pc, #12]   ; 0x8014628
0801461c:   ldr     r1, [r0, #0]

But then I don't know how to interpret that stack frame. It is automated.

I did see that Nadler site but a breakpoint on malloc doesn't reveal any heap usage at all, after the known calls (5 of them after startup). GCC printf doesn't call a malloc for sure. It may have an internal heap... In years past, I saw some usage of statics which would obviously not be thread safe.

« Last Edit: July 23, 2022, 12:55:03 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 825
  • Country: es
« Reply #16 on: July 23, 2022, 01:43:43 pm »
Check 080148A8 (a return address saved from LR in Thumb code always has bit 0 set, so the 08014610 you've spotted is probably just some pointer).
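
The bit 0 rule can be turned into a crude filter over a copied stack dump. The sketch below is a heuristic under stated assumptions: the flash range is a guess for a 1 MB part, the function names are invented, and ordinary data can still pass the test by accident. It also ignores the PC that the hardware stacks on exception entry, which is stored with bit 0 clear.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_START 0x08000000u  /* assumption: STM32F4 flash base */
#define FLASH_END   0x080FFFFFu  /* assumption: 1 MB part */

/* True if the word could be an LR-style Thumb return address:
   inside flash and with bit 0 set. */
bool looks_like_thumb_return(uint32_t word)
{
    return (word & 1u) != 0 && word >= FLASH_START && word <= FLASH_END;
}

/* Scans a copy of the stack and stores the code addresses (bit 0 cleared)
   of every candidate in out. Returns the number stored, at most max. */
size_t find_return_candidates(const uint32_t *stack, size_t words,
                              uint32_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < words && found < max; i++) {
        if (looks_like_thumb_return(stack[i]))
            out[found++] = stack[i] & ~1u;
    }
    return found;
}
```

Each candidate address can then be looked up in the .map file or the disassembly to see which function it belongs to.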
 

peter-h (Topic starter)
« Reply #17 on: July 23, 2022, 01:54:21 pm »
Nothing in the .map file for 080148A8 but here is the content



That is within FreeRTOS.

I have been digging around to see how the arm32 stack is filled up and can't find a clear description, so I don't know what to make of 100079f0 (which I am not looking at yet, because I would need to restart the target and then all the values may change). This is what is in there (data, not code)



The fact that SP was in the general stack, not within one of the RTOS stacks (0x10000000+), tells me that this was an ISR which did it, because the CPU switches to the general stack for interrupts. My ISRs should all be in main RAM (0x20000000+).
« Last Edit: July 23, 2022, 02:00:16 pm by peter-h »
 

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 449
  • Country: de
  • ee - digital & analog
    • My services:
« Reply #18 on: July 23, 2022, 02:50:09 pm »
ataradov, thank you for that snippet. I am looking forward to using this.
I added this to my existing HardFault handler. In my production code the while(1) is replaced by a define, which is either an endless loop for debugging or an immediate system restart request for production use. I also added "__attribute__((unused))".
Adjusted for CubeIde 1.3, your examples might read like:

Code: [Select]
// put this function in Vector Table
__attribute__((naked)) void HardFault_Handler_asm(void)
{
  asm(
    "mov    r0, lr\n"
    "mrs    r1, msp\n"
    "mrs    r2, psp\n"
    "b      HardFault_Handler_c\n"
  );
}

// c-handler with breakpoint
void HardFault_Handler_c(uint32_t lr, uint32_t msp, uint32_t psp)
{
  __attribute__((unused)) uint32_t s_r0, s_r1, s_r2, s_r3, s_r12, s_lr, s_pc, s_psr;
  __attribute__((unused)) uint32_t r_CFSR, r_HFSR, r_DFSR, r_AFSR, r_BFAR, r_MMAR;
  uint32_t *sp = (uint32_t *)((lr & 4) ? psp : msp);

  s_r0  = sp[0];
  s_r1  = sp[1];
  s_r2  = sp[2];
  s_r3  = sp[3];
  s_r12 = sp[4];
  s_lr  = sp[5];
  s_pc  = sp[6];
  s_psr = sp[7];

  r_CFSR = SCB->CFSR;  // Configurable Fault Status Register (MMSR, BFSR and UFSR)
  r_HFSR = SCB->HFSR;  // Hard Fault Status Register
  r_DFSR = SCB->DFSR;  // Debug Fault Status Register
  r_MMAR = SCB->MMFAR; // MemManage Fault Address Register
  r_BFAR = SCB->BFAR;  // Bus Fault Address Register
  r_AFSR = SCB->AFSR;  // Auxiliary Fault Status Register

  asm("nop"); // Setup breakpoint here

  while(1);
}
 

wek
« Reply #19 on: July 23, 2022, 02:50:44 pm »
>  I don't know how to interpret that stack frame.

The first two rows are - as you've already used them - R0, R1, R2, R3, R12, LR, PC, xPSR, as they were stacked by the hardware on fault entry.

If you have the FPU enabled, then the next 4 rows are the FPU registers (or just space for them if lazy stacking is on, which it probably is) and one more word is FPSCR. There may be an aligner, too, see CCR.STKALIGN.

The rest is what was at the stack at the moment when the fault happened.
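
The frame layout just described can be sketched as a size calculation: given the frame address, the EXC_RETURN value and the stacked xPSR, work out where the stack pointer was before the exception. This assumes the ARMv7-M frame sizes (8 words basic, 26 words with FP state) and that bit 9 of the stacked xPSR marks an inserted aligner word; the names are hypothetical.

```c
#include <stdint.h>

#define BASIC_FRAME_BYTES    (8u * 4u)   /* R0-R3, R12, LR, PC, xPSR */
#define EXTENDED_FRAME_BYTES (26u * 4u)  /* basic + S0-S15 + FPSCR + reserved word */

/* The first eight words of either frame type, in stacking order.
   This matches sp[0]..sp[7] in the instrumented handler. */
struct basic_frame {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

/* Returns the stack pointer value from just before exception entry. */
uint32_t sp_before_exception(uint32_t frame_addr, uint32_t exc_return,
                             uint32_t stacked_xpsr)
{
    /* EXC_RETURN bit 4: 1 = basic frame, 0 = extended (FP) frame. */
    uint32_t size = (exc_return & (1u << 4)) ? BASIC_FRAME_BYTES
                                             : EXTENDED_FRAME_BYTES;

    /* Stacked xPSR bit 9 set: the hardware added a 4-byte aligner. */
    if (stacked_xpsr & (1u << 9))
        size += 4u;

    return frame_addr + size;
}
```

Everything at or above the returned address is "the rest", that is, what the interrupted code itself had pushed.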

This is a post-mortem status; nowhere is it said that it's useful, and it may also be full of red herrings, but it's all you have. The svc instruction in the snippet you posted causes the SVC exception, which should also stack 0x08014610 as PC at that point, but that would be on the bottom of the stack only if the FPU were not used, so that's confusing. I'd have a look at the SVC handler, too, just for the fun. Yes I know that's the heart of the RTOS. I don't use RTOS and have exactly zero experience debugging it or debugging within it.

"Normally", with "simple errors", the fault happens so that PC points to the last "correct" offending instruction. Your PC points to 0x00208CEA. I don't know how that area behaves, probably traps, so it must've been a jump to that address immediately before, but we of course have no trace of where it jumped from. The problem with post-mortem analysis is, that you can't walk backwards (plus runaway program sometimes destroys evidence, too).

What is also strange is the content of the CFSR and HFSR registers you've given above; they are complete nonsense.
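
One way to check that claim mechanically is to mask off every bit that has a defined meaning and see what is left. The masks below are derived from the ARMv7-M definitions of CFSR and HFSR as I understand them for a Cortex-M4 with FPU; treat them and the function names as assumptions of this sketch.

```c
#include <stdint.h>

/* CFSR: MMFSR uses bits 0,1,3,4,5,7; BFSR uses bits 8-13 and 15;
   UFSR uses bits 16-19, 24 and 25. Everything else is reserved. */
#define CFSR_DEFINED_BITS 0x030FBFBBu

/* HFSR: VECTTBL (bit 1), FORCED (bit 30), DEBUGEVT (bit 31). */
#define HFSR_DEFINED_BITS 0xC0000002u

/* Returns the bits of a captured CFSR value that should never be set. */
uint32_t cfsr_reserved_bits(uint32_t cfsr)
{
    return cfsr & ~CFSR_DEFINED_BITS;
}

/* Returns the bits of a captured HFSR value that should never be set. */
uint32_t hfsr_reserved_bits(uint32_t hfsr)
{
    return hfsr & ~HFSR_DEFINED_BITS;
}
```

For the posted values, 0x60000000 as CFSR and 0x20000 as HFSR both consist entirely of reserved bits, so a non-zero result here is a reason to double-check how the values were captured or displayed.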

JW
 

wek
« Reply #20 on: July 23, 2022, 02:57:38 pm »
Coincidence?

As I've said, I know nothing about RTOS.

JW
 

peter-h (Topic starter)
« Reply #21 on: July 23, 2022, 03:07:21 pm »
SVC handler:

Code: [Select]
static void prvTaskExitError( void )
{
    volatile uint32_t ulDummy = 0;

    /* A function that implements a task must not exit or attempt to return to
    its caller as there is nothing to return to.  If a task wants to exit it
    should instead call vTaskDelete( NULL ).

    Artificially force an assert() to be triggered if configASSERT() is
    defined, then stop here so application writers can catch the error. */
    configASSERT( uxCriticalNesting == ~0UL );
    portDISABLE_INTERRUPTS();
    while( ulDummy == 0 )
    {
        /* This file calls prvTaskExitError() after the scheduler has been
        started to remove a compiler warning about the function being defined
        but never called.  ulDummy is used purely to quieten other warnings
        about code appearing after this function is called - making ulDummy
        volatile makes the compiler think the function could return and
        therefore not output an 'unreachable code' warning for code that appears
        after it. */
    }
}
/*-----------------------------------------------------------*/

void vPortSVCHandler( void )
{
    __asm volatile (
        " ldr r3, pxCurrentTCBConst2 \n" /* Restore the context. */
        " ldr r1, [r3] \n" /* Use pxCurrentTCBConst to get the pxCurrentTCB address. */
        " ldr r0, [r1] \n" /* The first item in pxCurrentTCB is the task top of stack. */
        " ldmia r0!, {r4-r11, r14} \n" /* Pop the registers that are not automatically saved on exception entry and the critical nesting count. */
        " msr psp, r0 \n" /* Restore the task stack pointer. */
        " isb \n"
        " mov r0, #0 \n"
        " msr basepri, r0 \n"
        " bx r14 \n"
        " \n"
        " .align 4 \n"
        "pxCurrentTCBConst2: .word pxCurrentTCB \n"
    );
}
/*-----------------------------------------------------------*/

static void prvPortStartFirstTask( void )
{
    /* Start the first task.  This also clears the bit that indicates the FPU is
    in use in case the FPU was used before the scheduler was started - which
    would otherwise result in the unnecessary leaving of space in the SVC stack
    for lazy saving of FPU registers. */
    __asm volatile(
        " ldr r0, =0xE000ED08 \n" /* Use the NVIC offset register to locate the stack. */
        " ldr r0, [r0] \n"
        " ldr r0, [r0] \n"
        " msr msp, r0 \n" /* Set the msp back to the start of the stack. */
        " mov r0, #0 \n" /* Clear the bit that indicates the FPU is in use, see comment above. */
        " msr control, r0 \n"
        " cpsie i \n" /* Globally enable interrupts. */
        " cpsie f \n"
        " dsb \n"
        " isb \n"
        " svc 0 \n" /* System call to start first task. */
        " nop \n"
    );
}

I am at the limit of my knowledge here, but if I can find the address of the code which resulted in this trap I can put in breakpoints around that.

That 79F0 address is within the TCP/IP RTOS task, which is completely unsurprising, but the code for that is in the FLASH. The stuff at 79F0 is just a data+RTOS stack area. For example if a function running under an RTOS declares a variable, that variable, being stack-based as normal, will end up in this area.

I can leave this for a bit, otherwise I can restart it and see if it ends up in the same place.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4078
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
« Reply #22 on: July 23, 2022, 03:36:36 pm »
Do you know what type of error caused it yet?
Have you checked the fault analyzer in CubeIDE yet? It will tell you exactly what everyone here is trying to access via the complicated assembly.

Relying on IDEs too much lol.
« Last Edit: July 23, 2022, 03:38:21 pm by Jeroen3 »
 

peter-h (Topic starter)
« Reply #23 on: July 23, 2022, 04:12:58 pm »



0x100079f0 is within one of the RTOS stack areas, for task "TCP/IP". This is the whole RTOS stack space for that task (FreeRTOS fills its entire workspace (a sort of heap actually) with A5).



Interestingly at 79F0 is 0x00000000, which looks like it overwrote the tail end of a MEM_SYS text string which was there before (which may not be relevant).

If this is repeatable I can do a watchpoint on 79F0 and 0x00000000.

The "unused stack" areas are much bigger than shown; I cropped it of necessity.
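
Since the unused parts of a task stack still hold the A5 fill, the remaining headroom can be measured by counting fill bytes from the low end of the stack. A minimal sketch of that idea follows, with an invented function name and assuming a descending stack; FreeRTOS offers uxTaskGetStackHighWaterMark() for the same purpose on a running system.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_BYTE 0xA5u  /* fill value mentioned in the post */

/* Counts untouched fill bytes starting at the lowest address of a task
   stack. On a descending stack this is the headroom that was never used. */
size_t unused_stack_bytes(const uint8_t *stack_low, size_t stack_bytes)
{
    size_t n = 0;

    while (n < stack_bytes && stack_low[n] == STACK_FILL_BYTE)
        n++;
    return n;
}
```

A result of zero, or close to it, for the TCP/IP task would point towards a stack overflow; a stray non-fill word in the middle of an otherwise untouched region, like the zero at 79F0, points more towards a wild write.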
« Last Edit: July 23, 2022, 04:15:26 pm by peter-h »
 

Jeroen3
« Reply #24 on: July 23, 2022, 04:38:56 pm »
Have you enabled "halt on exception" in the debug startup settings yet? Then it breaks at the exact instruction causing the fault, with the context intact.
You can then look at the window with the function call tracing (I forget the name) to see how you got to that point and what pointers were used to get there.
You should be able to click back through that trace to see more context of those functions and whether any pointers go to something they shouldn't. If that trace is gibberish, you've smashed the stack.
 

