Author Topic: Porting stm32 to gigadevice mcu: weird c++ hard fault issue [Solved] (Read 4240 times)

seabea · « **on:** January 17, 2022, 04:39:46 pm »

Hello!

We have an application already in production running on an STM32F469BIT6.
We are now porting it to a GigaDevice GD32F450ZK MCU.

We are seeing problems with the C++ application code, which runs into hard faults at "random" positions on the GD32 MCU.

The HAL (hardware abstraction layer) is running, and clock, SDRAM, external flash, CAN (controller area network), LCD and trace are working.

We checked the timing on an analog output with the oscilloscope, checked the pin connections with the microscope and used two different boards. A hardware error is unlikely.

We have test programs that run all SDRAM and IRAM checks, we already went back to the slowest timing possible, and we checked the connections on the (homemade) board.
Everything seems OK, and C code runs without such problems.
The heap is properly mapped; malloc and operator new return valid pointers.
IRAM is also working, of course. Some code runs from the external flash, but we also checked this with a test application:
no errors were detected there, and in addition the C++ code may still produce hard faults when linked to IRAM.

malloc and new are all we use in the C++ code at the point where the hard faults already occur.
(No initialization of external interrupts or peripherals.)

The C++ application code runs into a hard fault sporadically, at different program positions.
This happens during predictable initialization where no asynchronous external events occur and no interrupts are enabled, just straightforward allocating and initializing.

Rebuilding the app with a stripped-down object-oriented model has not reproduced the errors so far.

We're a bit clueless at the moment and we hope it's our fault, but we cannot figure out what we are doing wrong...

The hard fault status register reports a data bus error, but the error only occurs within the C++ application and not when we test the RAMs with test routines. (see attachment)

Quote

IMPRECISERR is set when a data bus error has occurred, but the return address in the stack frame is not related to the instruction that caused the error.
FORCED is set to indicate a forced HardFault, generated by escalation of a fault with configurable priority that cannot be handled, either because of priority or because it is disabled

see https://www.keil.com/dd/vtr/4889/8687.htm
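
For readers decoding the attachment, the two flags quoted above can be picked out of the raw register values. The following is a minimal host-side sketch, not part of the original post. The bit positions are those of the ARMv7-M fault status registers (CFSR and HFSR); `describe_fault` is a hypothetical helper name. Note that with IMPRECISERR set, the stacked return address does not necessarily point at the instruction that caused the fault.

```cpp
#include <cstdint>
#include <string>

// Bit positions as defined for the ARMv7-M fault status registers.
constexpr std::uint32_t CFSR_IBUSERR     = 1u << 8;   // bus fault on instruction fetch
constexpr std::uint32_t CFSR_PRECISERR   = 1u << 9;   // stacked PC points at the faulting access
constexpr std::uint32_t CFSR_IMPRECISERR = 1u << 10;  // stacked PC is not related to the faulting access
constexpr std::uint32_t CFSR_BFARVALID   = 1u << 15;  // BFAR holds the faulting address
constexpr std::uint32_t HFSR_VECTTBL     = 1u << 1;   // fault while reading the vector table
constexpr std::uint32_t HFSR_FORCED      = 1u << 30;  // escalated configurable fault

// Hypothetical helper: turn raw CFSR/HFSR values into a readable flag list.
std::string describe_fault(std::uint32_t cfsr, std::uint32_t hfsr) {
    std::string out;
    if (hfsr & HFSR_FORCED)      out += "FORCED ";
    if (hfsr & HFSR_VECTTBL)     out += "VECTTBL ";
    if (cfsr & CFSR_IBUSERR)     out += "IBUSERR ";
    if (cfsr & CFSR_PRECISERR)   out += "PRECISERR ";
    if (cfsr & CFSR_IMPRECISERR) out += "IMPRECISERR ";
    if (cfsr & CFSR_BFARVALID)   out += "BFARVALID ";
    if (out.empty()) {
        out = "none";
    } else {
        out.pop_back();  // drop the trailing space
    }
    return out;
}
```

On the target, the raw values would typically be read from `SCB->CFSR` and `SCB->HFSR` (addresses 0xE000ED28 and 0xE000ED2C) inside the hard fault handler. On Cortex-M4 cores, temporarily setting DISDEFWBUF in the Auxiliary Control Register is a commonly used way to make faulting writes precise while debugging, at some cost in performance.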

Has anyone faced such issues with C++ on GigaDevice in the Keil environment?

kind regards + hello to the forum members!

PS: The hard fault code is shown in the attachment.
The ADDR is always different, as mentioned above.

We're on the latest 2.0.0 device support package.
The compiler version 1.6.3 is also up to date.
Keil 5.18a is a bit out of date, though.

bson · « **Reply #1 on:** January 17, 2022, 06:38:01 pm »

It's a data access bus fault, and the return address on the stack refers to the instruction that made the faulting access. Look at the instruction and see what it's trying to access - the problem is with the target of that access, not the instruction itself.

thm_w · « **Reply #2 on:** January 17, 2022, 10:59:10 pm »

You know GD32F450ZK is 256K sram. STM32F469BG is 384kB (with CCM), 200MHz vs 180MHz
Probably some major differences in the peripherals as well.

How much porting effort was actually done here? You are using STM32 HAL?

seabea · « **Reply #3 on:** January 18, 2022, 11:24:38 am »

Quote from: bson on January 17, 2022, 06:38:01 pm

It's a data access bus fault, and the return address on the stack refers to the instruction that made the faulting access. Look at the instruction and see what it's trying to access - the problem is with the target of that access, not the instruction itself.

Thanks. I know about this process. The problem is that the faulting address is different each time I run the code.

In one case I saw in the debugger, it was accessing a target in an area where there is no memory at all (0xe.....)
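
As a side note on the 0xe..... address: in the default ARMv7-M address map, everything from 0xE0000000 upward is the system region rather than ordinary memory, so a data pointer landing there usually suggests a corrupted value. The sketch below classifies an address by the architecture-defined regions. It is an illustration only, and `Region` and `armv7m_region` are hypothetical names. Vendor-specific details (where SDRAM or external flash sit inside the external regions) are deliberately left out.

```cpp
#include <cstdint>

// Regions of the default ARMv7-M system address map.
enum class Region {
    Code,                  // 0x00000000 - 0x1FFFFFFF
    Sram,                  // 0x20000000 - 0x3FFFFFFF
    Peripheral,            // 0x40000000 - 0x5FFFFFFF
    ExternalRam,           // 0x60000000 - 0x9FFFFFFF
    ExternalDevice,        // 0xA0000000 - 0xDFFFFFFF
    PrivatePeripheralBus,  // 0xE0000000 - 0xE00FFFFF
    VendorSystem           // 0xE0100000 - 0xFFFFFFFF
};

// Hypothetical helper: map an address to its architecture-defined region.
Region armv7m_region(std::uint32_t addr) {
    if (addr < 0x20000000u) return Region::Code;
    if (addr < 0x40000000u) return Region::Sram;
    if (addr < 0x60000000u) return Region::Peripheral;
    if (addr < 0xA0000000u) return Region::ExternalRam;
    if (addr < 0xE0000000u) return Region::ExternalDevice;
    if (addr < 0xE0100000u) return Region::PrivatePeripheralBus;
    return Region::VendorSystem;
}
```

Logging the region of each faulting address over several runs could help show whether the bad pointers cluster in one memory or are scattered.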

Quote from: thm_w on January 17, 2022, 10:59:10 pm

You know GD32F450ZK is 256K sram. STM32F469BG is 384kB (with CCM), 200MHz vs 180MHz
Probably some major differences in the peripherals as well.

How much porting effort was actually done here? You are using STM32 HAL?

Yes, I know. I think it was the best match we could purchase given the delivery problems.

We use the STM32 HAL for the STM32 only.

Most of my effort so far went into refactoring an abstraction for the HAL adapter. For the GD32 we of course use the GD32 HAL functions, not the STM32 HAL.

My colleague just stated that in his tests the program does not enter the main function unless he removes all C++ code...
It seems the problem is exclusively related to C++ code.

seabea · « **Reply #4 on:** January 18, 2022, 03:55:11 pm »

OK, I have a lead: the code executed from external flash produces these errors.

The same code executed from IROM works.

So I think the timing is not set properly for the external flash. I will check this tomorrow with my colleague.

It just seems strange to me that this error was not detected by the write/read tests...

hans · « **Reply #5 on:** January 20, 2022, 10:29:22 am »

If those read/write tests are sequential, then they may not catch them. Normal code execution has a lot of random access, hence why we have caches (accelerators) with wait states on FLASH memories.
Did you get around to fixing this? Was it actually caused by some flash access error?
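
To illustrate the point about sequential versus random access, here is a minimal sketch of a RAM check that reads back at pseudo-random indices instead of walking the buffer linearly. It is not the test the original poster used. `random_access_check` and `xorshift32` are hypothetical names, and the multiplier is just an arbitrary odd constant used to derive a pattern from the index.

```cpp
#include <cstddef>
#include <cstdint>

// Small xorshift32 generator so the check does not depend on a library RNG.
static std::uint32_t xorshift32(std::uint32_t& state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Fill the buffer with an index-derived pattern, then read it back at
// pseudo-random indices. Returns the number of mismatching reads.
std::size_t random_access_check(volatile std::uint32_t* mem, std::size_t words,
                                std::size_t reads, std::uint32_t seed) {
    if (words == 0) return 0;
    for (std::size_t i = 0; i < words; ++i) {
        mem[i] = static_cast<std::uint32_t>(i) * 2654435761u;
    }
    std::uint32_t state = seed ? seed : 1u;  // xorshift state must be non-zero
    std::size_t errors = 0;
    for (std::size_t n = 0; n < reads; ++n) {
        const std::size_t i = xorshift32(state) % words;
        if (mem[i] != static_cast<std::uint32_t>(i) * 2654435761u) ++errors;
    }
    return errors;
}
```

On a target, `mem` would point at the region under test, for example the start of external SDRAM. For read-only memory such as external flash, the fill loop would be dropped and the reads compared against a known programmed image instead.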

seabea · « **Reply #6 on:** January 20, 2022, 04:37:03 pm »

Yesterday we found the problem and a solution.

It seems you cannot share the external memory bus between data and code accesses.

This means, for example, that if you have code in external flash you cannot access variables in SDRAM, because SDRAM and external flash are both in external memory.

My colleague tested many configurations and this rule seems to hold. There was no statement about it in the GD32 reference manual, and an errata sheet does not exist at all (!).

So the solution is to place all code in the IROM, with no code in external memory. Then the application memory can be located in external RAM (SDRAM) and some pictures etc. can be in external flash.

This will become a show-stopper if the 2 MB of IROM is not enough for the application code, and we are quite close to that.

This is an evil pitfall and the documentation does not mention it.

seabea · « **Reply #7 on:** March 31, 2022, 01:43:08 pm »

Hi guys,

I don't want to keep secret the misery that followed during our further development:

Memory Controller Problem

- We found a problem in the memory controller, which sometimes has trouble switching between different external memories. To be exact, writing to (external) SDRAM resulted in wrong values if the write was done shortly after reading the data from external flash. In my reproducer test routine, just inserting an if-statement in between made the error disappear. So it seems certain instructions recover the memory controller from its bad state. All this is why we can't use the external flash. GigaDevice confirmed this and said it should be fixed in the next MCU (I guess it was the GD32F470).

As stated earlier, we first used the external flash for images and font data. The result was sporadic white stripes in the fonts and bitmap data. Sometimes the bitmap was completely distorted, probably in cases where the I/O errors affected the bitmap header. On every reset we got a different result for the startup image. So we moved everything to IROM and made different builds with different language packages. Then it worked.
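
For anyone who wants to check their own board for this behaviour, the access pattern described above (read from one external memory, write to another directly afterwards, then verify) can be sketched roughly as follows. This is not the original reproducer routine; `copy_and_verify` is a hypothetical name, and on a host PC the check will of course always pass.

```cpp
#include <cstddef>
#include <cstdint>

// Copy word by word so that every write to dst directly follows a read
// from src, then verify in a second pass. Returns the index of the first
// mismatch, or `words` if everything matches.
std::size_t copy_and_verify(const volatile std::uint32_t* src,
                            volatile std::uint32_t* dst, std::size_t words) {
    for (std::size_t i = 0; i < words; ++i) {
        dst[i] = src[i];  // read from one memory, immediate write to the other
    }
    for (std::size_t i = 0; i < words; ++i) {
        if (dst[i] != src[i]) return i;
    }
    return words;
}
```

On the target, `src` would point at constant data linked to external flash and `dst` at a buffer in SDRAM. Since the fault described here was sensitive to the exact instruction sequence, a test like this may need to be built at several optimization levels before it triggers.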

Display Controller Problem

- We have now found another problem with the display controller. We had it clocked too slowly at 20 MHz, which resulted in slight flickering. When our production approval team complained about that, we found the error and set the clock to the specified speed of 33 MHz. Then the image was totally distorted.

We found 2 things:
a) The electrical signal was distorted (overshoot and long ramp times). This may cause the distorted image and may be caused by our hardware, maybe just by the lab setup.
b) What is not caused by our hardware is that the length of one line in the electrical signal is longer than what we set in the display controller registers (TLI_SPSZ, TLI_BPSZ, TLI_ASZ and TLI_TSZ).
This seems to be another problem with the MCU, or rather with the display controller.
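
When comparing a measured line length against the register settings, it can help to compute the expected values first. The sketch below assumes the TLI registers use the same "accumulated minus one" encoding as the STM32 LTDC timing registers; that is an assumption to check against the GD32 user manual, not something confirmed in this thread. The struct and function names are hypothetical, and the panel timings are placeholders to be filled in.

```cpp
#include <cstdint>

// Horizontal panel timing in pixel clocks (values come from the panel datasheet).
struct HorizontalTiming {
    std::uint32_t sync_width;
    std::uint32_t back_porch;
    std::uint32_t active_width;
    std::uint32_t front_porch;
};

// Assumed LTDC-style encoding: each field is the running sum minus one.
struct AccumulatedFields {
    std::uint32_t sync;    // sync_width - 1
    std::uint32_t back;    // sync_width + back_porch - 1
    std::uint32_t active;  // sync_width + back_porch + active_width - 1
    std::uint32_t total;   // full line length - 1
};

AccumulatedFields accumulate(const HorizontalTiming& t) {
    AccumulatedFields f;
    f.sync   = t.sync_width - 1;
    f.back   = f.sync + t.back_porch;
    f.active = f.back + t.active_width;
    f.total  = f.active + t.front_porch;
    return f;
}

// Expected duration of one line in nanoseconds for a pixel clock given in Hz.
double line_period_ns(const HorizontalTiming& t, double pixel_clock_hz) {
    const double clocks = static_cast<double>(t.sync_width) + t.back_porch +
                          t.active_width + t.front_porch;
    return clocks * 1e9 / pixel_clock_hz;
}
```

If the line period seen on the oscilloscope matches the value computed from the raw (non-accumulated) register contents plus a few clocks, an encoding mismatch is a more likely explanation than a silicon fault.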

BTW, we could not verify that the typical display clock speed of 33.3 MHz is supported by the display controller since, in contrast to the STM32, we found no statement about the supported clock rate of the display controller in the datasheet.

Conclusion

All in all I'm not too satisfied with the controller. I think it is legitimate to reduce some features in a clone, but there seem to be major errors and, as far as I was told, GD does not supply an errata sheet at all. One can only guess at the reason ;-)

Furthermore, they would only grant support if we used their HAL layer, which was additional effort for us since we configured the MCU registers directly. After we were able to reproduce the problem with the GD32 HAL layer, we were told it was a known problem (with the memory controller).

However, cheaper unit prices often mean more development effort, but this one is a unique experience.

Let's hope it'll be better with the ?GD32F470?.

seabea · « **Reply #8 on:** March 31, 2022, 02:12:16 pm »

Quote from: hans on January 20, 2022, 10:29:22 am

Normal code execution has a lot of random access, hence why we have caches (accelerators) with wait states on FLASH memories.
Did you get around to fixing this? Was it actually caused by some flash access error?

Hi Hans,

Well, I must say I have only a limited understanding of caches, wait states and what exactly users need to do to deal with them. But the colleague who wrote the drivers is aware of it, and I think he configured the wait states correctly. Is there anything the application needs to call to deal with this? Can the problems I described above be related to some missing flush calls or similar? As stated, we read from external flash and wrote to external SDRAM (using static const C variables linked to external flash and dynamically allocated memory in SDRAM) and the result contained errors. To me it is a fairly solid assumption that the problem is with the MCU itself.

To answer the question: we found workarounds for nearly all the problems (as I described above).

PS: Now I get you: you meant it's the reason why we did not catch the error with our test routines. Good point. Actually the consensus was a bit like "We tested it and it has to work, so it's an application problem" ;-)

