Topic: 32F417 / 32F437 auto detect of extra 64k RAM

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
32F417 / 32F437 auto detect of extra 64k RAM
« on: June 28, 2023, 04:05:56 pm »
I have all this working, and tweaked things like _sbrk()

Code: [Select]
// This is used by malloc().
// The original Newlib version of this, on which the ST code was based
// https://github.com/zephyrproject-rtos/zephyr/blob/main/lib/libc/newlib/libc-hooks.c
// allowed the heap to go all the way up to the current SP value, which is stupid.
// This one sets the limit at the base (lowest memory address) of the stack area.
// Also see init_heap() in main.c.

#include <errno.h>      // errno, ENOMEM
#include <stddef.h>     // NULL
#include <stdint.h>     // uint32_t
#include <sys/types.h>  // caddr_t

caddr_t _sbrk(int incr)
{
    // These two are defined in the linkfile, but the effective top is raised according to g_dev_id
    extern char end asm("_end"); // end of BSS
    extern char top asm("_top"); // base of the general stack

    extern uint32_t g_dev_id;

    int extra_ram = 0;

    if (g_dev_id == 437)
    {
        extra_ram += (64 * 1024);
    }

    static char *heap_end; // static, so this gets initialised to NULL by C convention
    char *prev_heap_end;   // set on every call; this is the value returned

    // This sets heap_end to end of BSS, on the first call to _sbrk
    if (heap_end == NULL)
        heap_end = &end;

    prev_heap_end = heap_end;

    // top = top of RAM minus size of stack
    if ( (heap_end + incr) > (&top + extra_ram) )
    {
        errno = ENOMEM; // not apparently used by anything
        prev_heap_end = (char*) -1;
    }
    else
    {
        heap_end += incr;
    }

    //debug_thread_printf("malloc sbrk, incr=%d, ret=%08x",(int)incr, (int) prev_heap_end);

    return (caddr_t) prev_heap_end;
}

where

Code: [Select]
// Get device ID and set up global ID value
if ( HAL_GetDEVID() == 0x413 )
g_dev_id=417;
if ( HAL_GetDEVID() == 0x419 )
g_dev_id=437;

but there is a problem: I cannot do e.g.

uint8_t fred[63*1024];

i.e. a static global variable cannot be placed in the extra 64k because the linkfile still contains the 32F417 memory mapping

Code: [Select]
MEMORY
{
FLASH_BOOT (rx)     : ORIGIN = 0x08000000, LENGTH = 32K
FLASH_APP (rx)      : ORIGIN = 0x08008000, LENGTH = 1024K-32K
RAM (xrw)           : ORIGIN = 0x20000000, LENGTH = 128K
RAM_L (xrw)         : ORIGIN = 0x2001f000, LENGTH = 4K
CCMRAM (rw)         : ORIGIN = 0x10000000, LENGTH = 64K
}

i.e. I would need to also change LENGTH = 128K to LENGTH = 192K. The error comes from the linker.

The obvious hack is to have LENGTH = 192K in the linkfile anyway but then the Build Analyser will be misleading if the CPU is a 417. No way around this, I guess



I just wonder how others have done it.

I really do not want to have two Cube IDE projects or different builds. And changing the CPU type in Cube (one needs to do it in more than one place, and that is just the ones I know about!) is a can of worms which would need a massive amount of regression testing.

It is of course obvious that the Cube Build Analyser won't know the CPU type :)

The 32F437 is believed by experts to be an exact superset of the 32F417, with just one exception: the Vbat measurement using ADC1 (/4 or /2 resistor divider) and FWIW I have verified this over years of development work (I have both CPUs in various boards).

Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 6540
  • Country: es
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #1 on: June 28, 2023, 05:11:25 pm »
Why don't you simply make a 417/407 firmware?
You'll have to program them anyways!
Hantek DSO2x1x            Drive        FAQ          DON'T BUY HANTEK! (Aka HALF-MADE)
Stm32 Soldering FW      Forum      Github      Donate
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16288
  • Country: fr
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #2 on: June 28, 2023, 07:09:36 pm »
I would personally write a linker script which defines the base common RAM area (and associate all 'common' data sections to it), and define a separate RAM area for the extra 64KB, and define a dedicated section for it.

Then whatever data in C that you define that must be in this extra 64KB will have to be declared using a section attribute.

I suppose you may have wanted something more "transparent", but I don't really see how you can do this safely.
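A minimal sketch of what the C side of this could look like, assuming the linker script gains a second memory region for the extra 64 KB with an output section mapped to it. The section name `.ram_extra`, the macro `IN_EXTRA_RAM` and the helpers `extra_ram_available()` and `get_fred()` are made up for illustration; `g_dev_id` and `fred` are the names from the first post.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name. The linker script has to map it to the extra
   64 KB region; startup code will not zero it unless you add that yourself. */
#if defined(__GNUC__) && defined(__ELF__)
#define IN_EXTRA_RAM __attribute__((section(".ram_extra")))
#else
#define IN_EXTRA_RAM /* no section support assumed on other toolchains */
#endif

/* 417 or 437, filled in at startup from the device ID (defined here only
   so that the sketch is self-contained). */
uint32_t g_dev_id;

/* Lives in the extra RAM, so it only exists physically on a 32F437. */
IN_EXTRA_RAM uint8_t fred[63 * 1024];

/* Returns nonzero if the extra 64 KB is really there. */
int extra_ram_available(void)
{
    return g_dev_id == 437;
}

/* Single access point: callers never touch fred directly, so code running
   on a 32F417 cannot wander into the non-existent address range. */
uint8_t *get_fred(void)
{
    return extra_ram_available() ? fred : NULL;
}
```

Funnelling every access through one accessor is a design choice rather than a requirement; it just makes it harder to touch the buffer by accident on the smaller part.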
 

Offline thm_w

  • Super Contributor
  • ***
  • Posts: 8046
  • Country: ca
  • Non-expert
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #3 on: June 28, 2023, 10:28:15 pm »
Quote
Why don't you simply make a 417/407 firmware?
You'll have to program them anyways!

Yeah, although I assume this can be done under the same project, eg have a "407 release" and a "417 release", with build flags selecting which linker script to use.
Profile -> Modify profile -> Look and Layout ->  Don't show users' signatures
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28733
  • Country: nl
    • NCT Developments
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #4 on: June 28, 2023, 10:32:52 pm »
I'd put the heap into the extra memory and allow to allocate more memory if there is more ram. Put all static allocations that are fixed regardless of memory size in the memory size that is common. If there needs to be a 'static' array with a size that changes according to the amount of memory, then it should be allocated from the heap at the start and never released.
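A minimal sketch of the "allocate once at start and never release" idea, assuming a device ID of 417 or 437 has already been determined. The function names and both buffer sizes are illustrative only and do not come from this thread.

```c
#include <stdint.h>
#include <stdlib.h>

static uint8_t *g_big_buf;       /* allocated once, never freed */
static size_t   g_big_buf_size;  /* 0 until big_buf_init() succeeds */

/* Pick a buffer size to suit the part. Both sizes are made-up examples. */
size_t big_buf_size_for(unsigned dev_id)
{
    return (dev_id == 437) ? 96u * 1024u : 32u * 1024u;
}

/* Call once at startup, before anything else uses the heap heavily.
   Returns 0 on success, -1 if the allocation failed. */
int big_buf_init(unsigned dev_id)
{
    if (g_big_buf != NULL)
        return 0;                /* already set up; size stays as it was */

    size_t n = big_buf_size_for(dev_id);
    g_big_buf = malloc(n);
    if (g_big_buf == NULL)
        return -1;

    g_big_buf_size = n;
    return 0;
}

uint8_t *big_buf(void)     { return g_big_buf; }
size_t   big_buf_len(void) { return g_big_buf_size; }
```

Because the block is taken first and never returned, it cannot contribute to fragmentation later on.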
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #5 on: June 29, 2023, 05:51:19 am »
Quote
Why don't you simply make a 417/407 firmware?
You'll have to program them anyways!

No need to change Cube to program a 417 or 437 :) It's a good Q why this works but it does. It does show the CPU type during the firmware load via SWD.

Quote
I assume this can be done under the same project, eg have a "407 release" and a "417 release", with build flags selecting which linker script to use.

Where would the build flags be set? Cube IDE has a fantastic amount of config options, which I totally want to avoid changing. There is a way to formally set the CPU type also, and I've documented this in the project design doc, but it seems to be necessary in several places.

Quote
I'd put the heap into the extra memory and allow to allocate more memory if there is more ram. Put all static allocations that are fixed regardless of memory size in the memory size that is common. If there needs to be a 'static' array with a size that changes according to the amount of memory, then it should be allocated from the heap at the start and never released.

That's clever, especially as the main driver of needing more RAM is TLS which gets a 48k block (within which it runs its private heap). But I actually have a working product with TLS and with the 417; it just has only about 10k free RAM.

My current code looks like this. I fill in a global variable with 417 or 437 (decimal) based on the CPU ID (0x413 or 0x419) and if it is a 437 I move certain operations (in this case filling the stack area with "S" - there are actually very few such operations) higher up by 64k. The syntax for picking up symbol values from the linkfile into C is bizarre but it is what is widely used

Code: [Select]
// Get CPU type into B_g_dev_id

if ( B_HAL_GetDEVID() == 0x413 )
B_g_dev_id=417;
if ( B_HAL_GetDEVID() == 0x419 )
B_g_dev_id=437;

// Fill stack with "S". This used to be in the startup .s code.
// We do it here because it is easier to grab the CPU ID in C code (above)

// We are filling the stack area but the stack is used to call B_memset :)
// So we just reduce the filled length by 64 bytes. Only about 20 is needed.
// The syntax needed to pick up the symbols is weird; also used in _sbrk().

extern char _top;
char* stack_base; // base of stack (from linkfile)
stack_base = &_top;
extern char _Stack_Size; // size of stack (from linkfile)
char* stack_size;
stack_size = &_Stack_Size;

if (B_g_dev_id==437)
{
stack_base += (64*1024); // 32F437 has stack 64k higher up
}
B_memset((char*)stack_base,'S',(int)stack_size-64);
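A fill like the one above is usually paired with a later scan that reports how much of the stack was never touched. A minimal sketch, runnable on a host against an ordinary array; the function names are made up, and on the target the base and length would come from `_top` (plus 64k on a 437) and `_Stack_Size`.

```c
#include <stddef.h>
#include <string.h>

#define STACK_FILL 'S'

/* Paint the region [base, base + len) with the fill pattern. */
void stack_paint(unsigned char *base, size_t len)
{
    memset(base, STACK_FILL, len);
}

/* The stack grows down towards base, so the untouched part is the run of
   fill bytes starting at base. A stacked byte that happens to equal 'S'
   can make the result slightly optimistic. */
size_t stack_unused_bytes(const unsigned char *base, size_t len)
{
    size_t n = 0;

    while (n < len && base[n] == STACK_FILL)
        n++;

    return n;
}
```

Calling `stack_unused_bytes()` periodically (or from a debug command) gives a high-water mark without needing a debugger attached.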

What would be helpful would be a prominent build-time warning if the yellow value below



falls below 64k. I have no idea how to do that. It would say something like "WARNING: RAM usage needs a 32F437".

It needs to come from the linkfile, and is all this lot

Code: [Select]

  /* Initialized data sections for rest of the unit. These go into RAM, load LMA copy after code */
  /* This stuff is copied from FLASH to RAM by C code in the main stub */
  .all_nonboot_data :
  {
    . = ALIGN(4);
    _s_nonboot_data = .;        /* create a global symbol at data start */
    *(.data .data*)             /* .data sections */
    . = ALIGN(4);
    _e_nonboot_data = .;        /* define a global symbol at data end */
  } >RAM  AT >FLASH_APP

  /* used by the main stub C code to initialize data */
  _si_nonboot_data = LOADADDR(.all_nonboot_data);


  /* Uninitialized data section for rest of unit */
  /* This stuff is zeroed by C code in the main stub */
  . = ALIGN(4);
  .all_nonboot_bss :
  {
    _s_nonboot_bss = .;         /* define a global symbol at bss start */
    *(.bss .bss* .COMMON .common .common*)
    . = ALIGN(4);
    _e_nonboot_bss = .;         /* define a global symbol at bss end */
  } >RAM
 
« Last Edit: July 03, 2023, 08:09:27 pm by peter-h »
 

Offline thm_w

  • Super Contributor
  • ***
  • Posts: 8046
  • Country: ca
  • Non-expert
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #6 on: June 29, 2023, 09:39:14 pm »
Quote
Where would the build flags be set? Cube IDE has a fantastic amount of config options, which I totally want to avoid changing. There is a way to formally set the CPU type also, and I've documented this in the project design doc, but it seems to be necessary in several places.

I don't use cube but it should be a standard feature in any IDE to have multiple build configurations.

Seems shown on page 61 here: https://www.st.com/resource/en/user_manual/um2609-stm32cubeide-user-guide-stmicroelectronics.pdf
-T "STM32F401RETX_FLASH.ld"
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #7 on: June 29, 2023, 10:03:51 pm »
I have a strange problem.

Is it OK to have the 1st vector in the VTOR vector table pointing to an initial stack value of ram base + 128k (0x20020000), but having SP itself pointing to ram base + 192k (0x20030000), at the point when interrupts are initially set up and enabled?

Code: [Select]
Vectors2:
  .word  _estack  // 0x20020000
  .word  main
  .word  NMI_Handler
  .word  HardFault_Handler
  .word  MemManage_Handler
etc

IOW, does the 32F4 interrupt system read that first vector table value for some purpose?

I know FreeRTOS bloody well does read it, hence the "adds" instruction there

Code: [Select]
static void prvPortStartFirstTask( void )
{
/* Start the first task.  This also clears the bit that indicates the FPU is
in use in case the FPU was used before the scheduler was started - which
would otherwise result in the unnecessary leaving of space in the SVC stack
for lazy saving of FPU registers. */

extern uint32_t g_dev_id;

if (g_dev_id==437)
{

__asm volatile(
" ldr r0, =0xE000ED08 \n" /* Use the NVIC offset register to locate the stack. */
" ldr r0, [r0] \n"
" ldr r0, [r0] \n"
" adds r0, #65536 \n" // adjust for stack being 64k higher up
" msr msp, r0 \n" /* Set the msp back to the start of the stack. */
" mov r0, #0 \n" /* Clear the bit that indicates the FPU is in use, see comment above. */
" msr control, r0 \n"
" cpsie i \n" /* Globally enable interrupts. */
" cpsie f \n"
" dsb \n"
" isb \n"
" svc 0 \n" /* System call to start first task. */
" nop \n"
);

The code is bombing in quite a drastic way :)

My vector table is in FLASH because that is safer, but if this was some huge problem it could be in RAM and then I can have the right value actually in the table.
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28733
  • Country: nl
    • NCT Developments
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #8 on: June 29, 2023, 10:08:21 pm »
Quote
Where would the build flags be set? Cube IDE has a fantastic amount of config options, which I totally want to avoid changing. There is a way to formally set the CPU type also, and I've documented this in the project design doc, but it seems to be necessary in several places.

Quote
I don't use cube but it should be a standard feature in any IDE to have multiple build configurations.

CubeIDE aka Eclipse supports multiple build configurations just fine. It is the stuff that ST has bolted on that loses track if you are going to target different microcontrollers. But then again, for any serious work you'll want to avoid relying on what ST has bolted onto Eclipse.  >:D
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9683
  • Country: fi
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #9 on: June 30, 2023, 05:40:32 am »
Quote
IOW, does the 32F4 interrupt system read that first vector table value for some purpose?

AFAIK, no. The first two entries are read out just for SP and PC initialization after actual chip reset signal (power up reset or software reset through NVIC_SystemReset())
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #10 on: June 30, 2023, 08:26:21 am »
Quote
CubeIDE aka Eclipse supports multiple build configurations just fine.

Sure, but I really want a single firmware to support boards with a 417 or a 437. There are huge production, support, maintenance, etc advantages.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9683
  • Country: fi
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #11 on: June 30, 2023, 08:40:39 am »
Stupid question: what kind of application benefits from more RAM? Unlike desktop computing where you might benefit from being able to open more Firefox tabs with more RAM, or make Photoshop faster by reducing disk swapping, usually microcontroller projects have fixed functional constraints, i.e., you need to do X, Y and Z. If the smaller RAM is enough to do that, why would you need to do anything different on the larger part? Just leave excess address space unused. Like, you don't need to initialize stack pointer with the end of RAM.

And, if you end up adding feature Å only on larger RAM parts, easiest way to do that is to add another section after the stack and put large static buffers etc. used by feature Å there. No need to do anything compile-time - use the larger RAM in linker script and let the compiler place that section on address space that is illegal on the smaller part. As the feature is turned off, those variables are never accessed and everything's fine.
« Last Edit: June 30, 2023, 08:50:22 am by Siwastaja »
 

Offline hans

  • Super Contributor
  • ***
  • Posts: 1727
  • Country: 00
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #12 on: June 30, 2023, 09:27:32 am »

Quote
falls below 64k. I have no idea how to do that. It would say something like "WARNING: RAM usage needs a 32F437".


That should not be a warning but an error.

The worst thing is to detect a fault, but not tell its severity to a programmer. Memory stack issues can lead to strange cases. I once had to debug a bootloader + application issue with a colleague. We spent the best part of 1-2 weeks on tracing it down. When the application stack(s) grew too much, the application would crash. Turns out we had some mess going on with stack pointers in our bootloader.

I agree with Siwastaja. Stacks can be put at any arbitrary location, as FreeRTOS will do, but by default the stack of main() is at the "end of RAM" (that it knows about). I read that you're using a heap, which is commonly not best practice on embedded because of fragmentation issues if it's constantly used for reallocations. More memory would alleviate some memory pressure of the heap, but if that is necessary due to unstable software, then that is bad design.

Ideally on embedded you allocate all buffers statically. If you know the processor at compile time, perhaps you could resize some buffers to handle bigger bursts if that is a necessary wish. But other than that I don't really see the point.
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #13 on: June 30, 2023, 11:57:06 am »
Quote
what kind of application benefits from more RAM?

Hard to explain without describing the product but to a large extent I am right now doing work which will be a helluva lot easier to do now than if somebody wants it in a year's time when I've forgotten half the stuff.

I can tell you I have enough RAM on the 417 (about 55k spare) unless somebody invokes TLS which mallocs a 48k block... There is a possible graphic LCD application; the biggest LCD with SPI is not that big but big enough, and graphics libs like RAM.

But right now I have broken this box comprehensively. I suspect the linkfile syntax does not allow /* comments */ in some places. I am seeing the CPU vanishing into hyperspace, but the binary seems ok byte for byte. Completely weird.

Quote
That should not be a warning but an error.

Absolutely!

Quote
Stacks can be put at any arbitrary location, as FreeRTOS will do, but by default the stack of main() is at the "end of RAM" (that it knows about). I read that you're using a heap, which is commonly not best practice on embedded because of fragmentation issues if it's constantly used for reallocations. More memory would alleviate some memory pressure of the heap, but if that is necessary due to unstable software, then that is bad design.

Ideally on embedded you allocate all buffers statically. If you know the processor at compile time, perhaps you could resize some buffers to handle bigger bursts if that is a necessary wish. But other than that I don't really see the point.

I am using the 64k CCM for FreeRTOS stacks, so that bit is isolated. But FR still picks up the _estack word from the vector table, for reasons too convoluted for my pay grade. But I handle that bit too, now, adding 64k to the value retrieved.

The heap is only for TLS (see above) and it is organised so it can never fragment (the block is freed when the TLS session ends, etc.). This has been tested over a couple of years. I hate heap stuff too but sometimes it is right for optional features, and in e.g. 55k it is impossible to fragment with a 48k malloc. One could statically alloc that 48k, sure, but you waste 48k if TLS is not used.

I see a lot of value in production in a single firmware edition, covering multiple product variants. For example I have ARINC429 as an expensive factory option, and the chip is auto detected at startup, and an RTOS task for it is started.
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #14 on: June 30, 2023, 07:19:37 pm »
Quote
The first two entries are read out just for SP and PC initialization after actual chip reset signal (power up reset or software reset through NVIC_SystemReset())

Does msp come into any of this? FreeRTOS uses it, but the preceding code doesn't appear to set it.
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 304
  • Country: ua
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #15 on: July 02, 2023, 10:43:52 pm »
FreeRTOS provides several different malloc implementations.
One of them is heap_5.c, which can use several non-adjacent memory blocks.
https://www.freertos.org/a00111.html#heap_5
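A rough sketch of how a heap_5 style region table could be built at run time from the detected part. To keep it compilable on a host, `heap_region_t` is a local stand-in that mirrors the layout of FreeRTOS's `HeapRegion_t` (start address, size); on the target you would use the real type and hand the table to `vPortDefineHeapRegions()` before the first allocation. `build_heap_regions()` and the `SRAM3_*` macro names are made up for illustration; 0x20020000 is the base + 128k address quoted earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for FreeRTOS's HeapRegion_t (assumption: same two fields). */
typedef struct {
    uint8_t *start;
    size_t   size;
} heap_region_t;

#define SRAM3_BASE 0x20020000u      /* extra RAM on the 32F437 */
#define SRAM3_SIZE (64u * 1024u)

/* Fills out[] with the common region, plus the extra 64k on a 437, followed
   by the { NULL, 0 } terminator that heap_5 expects. heap_5 also wants the
   entries in ascending address order, so on the target the common region
   must lie below SRAM3_BASE. Returns the number of real regions. */
size_t build_heap_regions(heap_region_t out[3], uint8_t *base,
                          size_t base_size, unsigned dev_id)
{
    size_t n = 0;

    out[n++] = (heap_region_t){ base, base_size };

    if (dev_id == 437)
        out[n++] = (heap_region_t){ (uint8_t *)(uintptr_t)SRAM3_BASE, SRAM3_SIZE };

    out[n] = (heap_region_t){ NULL, 0 };
    return n;
}
```

If the general stack is also moved to the top of the extra 64k, the second region would have to be shortened by the stack size so the two do not overlap.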
Open source embedded network library https://mongoose.ws
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #16 on: July 03, 2023, 06:52:42 am »
Currently I am using heap_4 and this has been working perfectly - like all of FreeRTOS actually.

For the moment I've had to give up on the auto detect of the extra 64k. I've put it behind a #define so I can move on with the project.

There is a mystery problem, which is intermittent and hard to track down. It crashes the target, and bizarrely Cube IDE needs to be restarted also!

It looks like one can't just load SP to _estack (the symbol of the last RAM location + 1 i.e. 0x20020000 for a 32F417; this load also happens at the very startup which is the asm startupxxx.s code) and then add 64k to it if the 437 is detected

Code: [Select]
// Get CPU type and store it

        uint32_t B_g_dev_id=0;

if ( B_HAL_GetDEVID() == 0x413 )
B_g_dev_id=417;
if ( B_HAL_GetDEVID() == 0x419 )
B_g_dev_id=437;

// Load SP to top of *real* RAM per CPU type

asm volatile ("ldr sp, = _estack \n");
#ifdef EXTRA_64K
if (B_g_dev_id==437)
{
asm volatile ("ldr sp, = _estack+65536 \n");
}
#endif

// Fill stack with "S". This used to be in the startup .s code.
// We do it here because it is easier to grab the CPU ID in C code (above)

// ldr r2, = 0x2001e000  /*   = _estack - _Stack_Size  */
// b LoopFillStack
//FillStack:
// movs r3, 0x53535353   /* fill with 'S' */
// str  r3, [r2]
// adds r2, r2, #4
//LoopFillStack:
// ldr r3, = _estack     /* = 0x20020000 */
// cmp r2, r3
// bcc FillStack

// We are filling the stack area but the stack is used to call B_memset :)
// So we just reduce the filled length a bit. Only a few are needed.
// The syntax needed to pick up the symbols is weird; also used in _sbrk().

extern char _top;
char* stack_base; // base of stack (from linkfile)
stack_base = &_top;
extern char _Stack_Size; // size of stack (from linkfile)
char* stack_size;
stack_size = &_Stack_Size;

#ifdef EXTRA_64K
if (B_g_dev_id==437)
{
stack_base += 65536; // 32F437 has stack 64k higher up
}
#endif

B_memset((char*)stack_base,'S',(int)stack_size-256);

void B_main_real(void);
B_main_real();
        for(;;);

It could be the stack fill but I can't see why. I suspect there is something else I am not setting up i.e. adding 64k to the end of the stack is not the whole story. I already do the +64k fixup to _sbrk so the heap should work correctly too, and same for the heap init at startup (a bug in the newlib heap; requires the whole heap to be allocated and freed before it can be used).

All modded code has been stepped through, too.

The trouble starts before the RTOS starts, but after timer tick interrupt is enabled, which may be another clue.

In the RTOS code there is this



which is the only reference to the entry SP value that I can find. The RTOS has its internal stacks in the 64k CCM, which is a separate area not used by anything else.

I wonder if perhaps there is some CPU sync stuff - the sort of thing which needs __DSB etc.

In fact looking at it again now I am confused whether the reference to 0xE000ED08 is actually reading the vector table and only that; the comment about the NVIC offset doesn't make sense.
« Last Edit: July 03, 2023, 08:12:50 am by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #17 on: July 03, 2023, 08:19:22 am »
You have several local vars in that function changing the SP. Depending on compiler’s allocation decisions (assign everything to registers or to stack) some of them could suddenly “teleport away” after the asm line writing to SP, resulting in a totally unpredictable behavior (i.e. the fill part using trash values found in the new stack memory for base/size).
Either do the SP modification earlier in the asm startup or split this code into two separate functions and call them from a function without local vars.

Edit: E000ED08 is VTOR register and that code looks correct (read the table ptr from VTOR, read the first word via table ptr).
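For readers who find the asm hard to follow, here is a C rendering of what that sequence computes, including the +64k adjustment from the modified FreeRTOS code. It is only a sketch: `initial_msp_for()` is a made-up name, and the vector table is passed in as a pointer so it can be tried on a host. On the target the pointer would be `(const uint32_t *)SCB->VTOR`.

```c
#include <stdint.h>

#define EXTRA_RAM_BYTES 0x10000u    /* 64k, as in the "adds r0, #65536" line */

/* vector_table points at the table that VTOR refers to.
   Word 0 of that table is the initial stack pointer value. */
uint32_t initial_msp_for(const uint32_t *vector_table, unsigned dev_id)
{
    uint32_t msp = vector_table[0];     /* "ldr r0, [r0]" on the table pointer */

    if (dev_id == 437)
        msp += EXTRA_RAM_BYTES;         /* stack sits 64k higher on a 32F437 */

    return msp;
}
```

With the table from the earlier post (first word 0x20020000) this yields 0x20020000 for a 417 and 0x20030000 for a 437.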
« Last Edit: July 03, 2023, 08:22:51 am by abyrvalg »
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #18 on: July 03, 2023, 11:34:52 am »
I thought asm is absolutely immune from optimisation. I guess you are saying the compiler might say to itself that sp has already been loaded so no need to load it again.

It amazes me that any C actually works :) Isn't a register "volatile"?

How about this



EDIT: I did that but it still bombs, so I will continue to dig around.

Does msp do anything, in the context of 32F4 interrupt system?
« Last Edit: July 03, 2023, 01:50:24 pm by peter-h »
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 304
  • Country: ua
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #19 on: July 03, 2023, 01:58:47 pm »
Quote
Currently I am using heap_4 and this has been working perfectly - like all of FreeRTOS actually.

Use whatever you use, I am pointing out that heap_5 exists, because no one mentioned it before, and it is relevant to your question.

By the way. What was that _sbrk snippet above, if you're using heap_4? _sbrk is for newlib's malloc. Do you intermix newlib's malloc and freertos's malloc ?
Open source embedded network library https://mongoose.ws
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #20 on: July 03, 2023, 02:21:35 pm »
Quote
and it is relevant to your question.

In what way? From the FR doc:

heap_4 - coalescences adjacent free blocks to avoid fragmentation. Includes absolute address placement option. heap_5 - as per heap_4, with the ability to span the heap across multiple non-adjacent memory areas.

AFAICT I don't need this. For the 32F417 and 32F437 my memory architecture is identical. The only difference is that the top end items are 64k higher up: the general stack (which is used only in main.c startup and then for interrupts) and therefore the available heap is 64k bigger. The heap base is the end of BSS as usual, and that does not move. I am not splitting the heap, etc.
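The layout just described (heap base fixed at the end of BSS, upper limit 64k higher on a 437) can be tried out on a host with a small simulation of the _sbrk() limit check. This is a sketch under stated assumptions, not the code from the first post: `heap_state_t`, `heap_init()` and `sbrk_sim()` are hypothetical names, the linker symbols are replaced by ordinary pointers, and a lower-bound check has been added that the original does not have.

```c
#include <errno.h>
#include <stddef.h>

#define EXTRA_RAM_437 (64u * 1024u)

typedef struct {
    char *start;   /* end of BSS: lowest heap address */
    char *brk;     /* current program break */
    char *limit;   /* base of the stack area, 64k higher on a 437 */
} heap_state_t;

void heap_init(heap_state_t *h, char *bss_end, char *stack_base, unsigned dev_id)
{
    h->start = bss_end;
    h->brk   = bss_end;
    h->limit = stack_base + ((dev_id == 437) ? EXTRA_RAM_437 : 0u);
}

/* Same contract as _sbrk(): returns the previous break, or (void *)-1 with
   errno set to ENOMEM if the request does not fit. */
void *sbrk_sim(heap_state_t *h, ptrdiff_t incr)
{
    char *prev = h->brk;

    /* Compare against the space left instead of forming an out-of-range pointer. */
    if (incr > h->limit - h->brk || incr < h->start - h->brk) {
        errno = ENOMEM;
        return (void *)-1;
    }

    h->brk += incr;
    return prev;
}
```

With an 8k gap between the end of BSS and the stack base, a 48k request fails for a 417 and succeeds for a 437, which mirrors the TLS block case.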

Quote
What was that _sbrk snippet above, if you're using heap_4? _sbrk is for newlib's malloc. Do you intermix newlib's malloc and freertos's malloc ?

No. The "general heap" in my product is indeed the Newlib heap. It works; I merely had to make it thread safe by mutexing malloc() and free(). The code contained mutex calls but a) they were dummies b) no source was supplied; only a lib and c) the lib was non-weak so had to be weakened using objcopy (various past threads). The FR heap is used only by FR, internally, and is all in the 64k CCM. Like the LWIP heap is used only by LWIP, internally. And for good measure MbedTLS runs its own heap, inside a 48k block allocated on the general heap :) Yes, it all works. AFAICT FR builds the tasks on its heap and never frees them. LWIP does god knows what internally. MbedTLS likewise.

_sbrk also works; I've been tracing the code to make sure.

I don't think this issue is heap related. I suspect it is caused by my messing with the SP and not fixing up something else.
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 851
  • Country: es
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #21 on: July 03, 2023, 05:45:10 pm »
Quote
I thought asm is absolutely immune from optimisation. I guess you are saying the compiler might say to itself that sp has already been loaded so no need to load it again.

No, I mean something like this:
Code: [Select]
PUSH {R4, LR}
SUB SP, #8 ; make space for local vars
LDR R0, =_top
STR R0, [SP] ; initialize start_of_stack var at SP+0
MOV R0, #0x100
STR R0, [SP, #4] ; initialize stack_size var at SP+4

LDR SP, =_estack+0x10000 ; your asm line
LDR R0, [SP] ; fetch start_of_stack
MOV R1, #0x53 ; ‘S’
LDR R2, [SP, #4] ; fetch stack_size
BL memset

memset would trash an area starting at random location and of random size because both LDRs from [SP+offset] are fetching some random values from locations off by +0x10000 (because SP has changed between STR and LDR). Kaboom!
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #22 on: July 03, 2023, 06:24:42 pm »
I just don't understand this.

I know that doing e.g.

void fred()
{
   asm volatile ("ldr sp, = _estack \n");
   ...
   ...
}

is no good because when fred() is entered, there is code there which sets up the stack frame (of a size appropriate to the local variables found within the function). But now I have moved SP so that stack frame is gone.

So if I want to transfer control (all these functions are never-return ones) to joe() and I want joe() to run on a different stack, I need to set up SP before calling joe().

Code: [Select]

// === At this point, interrupts and DMA must still be disabled ====
// Execute loader. Reboots afterwards.

extern char _loader_ram_start;
extern char _loader_ram_end;
extern char _loader_flash_start;

// Copy loader code and its init data to RAM.
B_memcpy(&_loader_ram_start, &_loader_flash_start, &_loader_ram_end - &_loader_ram_start);

// Set SP to top of CCM. This is not where the general stack is but it doesn't matter because
// the loader always reboots at the end. This assignment can't be done inside loader because it trashes
// the stack frame and any local variables which are allocated *at* the call.

asm volatile ("ldr sp, = 0x10010000 \n");

// See comments in loader.c for why the long call.

extern void loader_entry() __attribute__((long_call));
loader_entry();

// never get here (loader always reboots)
for (;;);
}

I've changed things around a bit but still can't see the problem. The parms to memset have been verified by stepping through the code.

The circumstances where it fails are obscure but if I #undef the 437 extra 64k tests it all runs. This is the code in previous discussion. B_main is entered from startupxxx.s init code, with SP loaded to top of 417 RAM:

Code: [Select]

// This is the original main() called from the startup_stm32f407xx.s code
// We don't do much here to enable the B_memset to fill the stack area. The real B_main (B_main_real)
// sets up a big stack frame because it contains some 512-byte buffer(s) on the stack.
// See also https://www.eevblog.com/forum/microcontrollers/32f4-arm32-huge-stack-frame-0x250-bytes-why/

void B_main(void)
{

// Get CPU type and store it

if ( B_HAL_GetDEVID() == 0x413 )
B_g_dev_id=417;
if ( B_HAL_GetDEVID() == 0x419 )
B_g_dev_id=437;

// Load SP to top of *real* RAM per CPU type
// This throws away previous stack but it doesn't matter

#ifdef EXTRA_64K
if (B_g_dev_id==437)
{
asm volatile ("ldr sp, = _estack+65536 \n");
}
else
{
asm volatile ("ldr sp, = _estack \n");
}
#else
asm volatile ("ldr sp, = _estack \n");
#endif

// Fill stack with "S". This used to be in the startup .s code.
// We do it here because it is easier to grab the CPU ID in C code (above)

// We are filling the stack area but the stack is used to call B_memset :)
// So we just reduce the filled length a bit. Only a few are needed.
// The syntax needed to pick up the symbols is weird; also used in _sbrk().

extern char _top;
char* stack_base; // base of stack (from linkfile)
stack_base = &_top;
extern char _Stack_Size; // size of stack (from linkfile)
char* stack_size;
stack_size = &_Stack_Size;

#ifdef EXTRA_64K
if (B_g_dev_id==437)
{
stack_base += 65536; // 32F437 has stack 64k higher up
}
#endif

B_memset((char*)stack_base,'S',(int)stack_size-256);

void B_main_real(void);
B_main_real();

// We should never get here

for (;;);

}

I already use
__attribute__((optimize("O0")))   
on the above function, to prevent any funny compiler business.
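
For reference, the fill-and-measure idea behind the 'S' fill can be tried on a PC with an ordinary array standing in for the stack area. This is only a sketch: stack_paint() and stack_high_water() are made-up names, not functions from my project, and the guard length plays the role of the 256 bytes left unfilled above.

```c
#include <stddef.h>
#include <string.h>

#define STACK_FILL 'S'   /* same fill byte as the B_memset call above */

/* Paint [base, base + size - guard) with the fill byte. The top 'guard'
   bytes are left alone because the caller is still running on them. */
void stack_paint(unsigned char *base, size_t size, size_t guard)
{
    if (guard > size)
        guard = size;
    memset(base, STACK_FILL, size - guard);
}

/* The stack grows downward, so untouched fill bytes sit at the low end.
   Count them from the base; the remainder is the deepest usage seen. */
size_t stack_high_water(const unsigned char *base, size_t size)
{
    size_t unused = 0;

    while (unused < size && base[unused] == STACK_FILL)
        unused++;

    return size - unused;
}
```

On the target, base and size would come from the linkfile symbols (plus the 64k offset on a 437), and the high-water value can be read at any later time to see how deep the stack has gone.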
« Last Edit: July 04, 2023, 08:44:11 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4600
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #23 on: July 04, 2023, 12:51:08 pm »
There is something simple to watch out for:

A function allocates a stack frame for all local variables which the compiler found inside it. This happens immediately after it is entered.

If that function manipulates SP, any local variables cannot be used anymore because the stack frame has gone. They have to be statics.

But you can load the SP and call a function. That function will obviously use the new stack.
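
A rough sketch of that pattern, not my actual code: the names are invented for illustration, and the SP load is only compiled when building for a Cortex-M4 (I am assuming GCC defines __ARM_ARCH_7EM__ there), so the same file also builds on a PC, where it simply makes the call.

```c
#include <stdint.h>

/* Statics live in .data/.bss, not in the stack frame, so they are still
   valid after SP has been reloaded. */
static uint32_t saved_dev_id;
static int      reached_new_stack;

/* Hypothetical continuation. On the target it must never return, because
   the caller's frame on the old stack has been abandoned. */
void run_on_new_stack(void)
{
    reached_new_stack = (saved_dev_id != 0);
}

void switch_stack_and_run(uint32_t dev_id, uintptr_t new_sp)
{
    saved_dev_id = dev_id;      /* stash in a static before touching SP */

#if defined(__ARM_ARCH_7EM__)
    /* Target build: load SP, then use no locals from here on. */
    __asm volatile ("mov sp, %0" : : "r" (new_sp) : "memory");
#else
    (void)new_sp;               /* host build: SP is left alone */
#endif

    run_on_new_stack();         /* runs on the new stack on the target */
}

/* Accessor so the result can be checked from outside this file. */
int reached(void)
{
    return reached_new_stack;
}
```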
« Last Edit: July 04, 2023, 12:54:23 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7531
  • Country: fi
    • My home page and email address
Re: 32F417 / 32F437 auto detect of extra 64k RAM
« Reply #24 on: July 04, 2023, 01:15:15 pm »
It is impossible to fix this cleanly at run time, because the linker has to know the available address space when it places static variables, long before the code can read the device ID.

Does your end user end up including the correct stm32f417xx.h/stm32f437xx.h header?
I assume they have to, because how would they access the peripherals correctly otherwise?

The two header files define the macros FLASH_BASE (0x08000000) and FLASH_END (0x080FFFFF on the '417, 0x081FFFFF on the '437), and only the '437 header defines SRAM3_BASE (0x20020000) and SRAM3_BB_BASE (0x22400000).

So, what I would do, is tell the users to add a specific include, after all other includes in their main source file.  That file would contain for example
Code: [Select]
#ifndef thatfile_H
#define thatfile_H

#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */

#define  DEFINE_LINKER_SYMBOL(_name) \
    __attribute__ ((externally_visible, used, unavailable, section (".omit"))) \
    void _name(void) { }

#if (FLASH_END - FLASH_BASE) == 0x001FFFFF
DEFINE_LINKER_SYMBOL(__device_flash_2048k);
#else
DEFINE_LINKER_SYMBOL(__device_flash_1024k);
#endif

DEFINE_LINKER_SYMBOL(__device_ccmram_64k);

#ifdef  SRAM3_BASE
DEFINE_LINKER_SYMBOL(__device_ram_192k);
#else
DEFINE_LINKER_SYMBOL(__device_ram_128k);
#endif

#ifdef __cplusplus
}
#endif /* __cplusplus */

#endif /* thatfile_H */
and start your linker script with
Code: [Select]
PROVIDE(FLASH_SIZE = DEFINED(__device_flash_2048k) ? 2048K : DEFINED(__device_flash_1024k) ? 1024K : 0);
ASSERT(FLASH_SIZE > 0, "thatfile_H not included: unknown Flash memory size!");

PROVIDE(CCMRAM_SIZE = DEFINED(__device_ccmram_64k) ? 64K : 0);
ASSERT(CCMRAM_SIZE > 0, "thatfile_H not included: unknown closely-coupled data memory size!");

PROVIDE(RAM_SIZE = DEFINED(__device_ram_192k) ? 192K : DEFINED(__device_ram_128k) ? 128K : 0);
ASSERT(RAM_SIZE > 0, "thatfile_H not included: unknown RAM size!");

MEMORY {
    FLASH_BOOT (rx) : ORIGIN = 0x08000000, LENGTH = 32K
    FLASH_APP (rx)  : ORIGIN = ORIGIN(FLASH_BOOT) + 32K, LENGTH = FLASH_SIZE - 32K
    CCMRAM (rwx)    : ORIGIN = 0x10000000, LENGTH = CCMRAM_SIZE
    RAM (rwx)       : ORIGIN = 0x20000000, LENGTH = RAM_SIZE - 4K
    RAM_L (rwx)     : ORIGIN = ORIGIN(RAM) + RAM_SIZE - 4K, LENGTH = 4K
}

SECTIONS {
    /DISCARD/ : { *(.omit) }
}
The .omit section is discarded, so that the flag linker symbols won't use up any room in the actual binary.

Alternatively, you can create header files 32f417.h and 32f437.h, which not only include the proper stm32f4?7xx.h, but also define the symbols used by the linker script to determine the address space ranges as above.

Then, the developer-user includes just that one file (which will also include any other necessary common includes et cetera), and off they go.
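
Since the memory size is then fixed at link time, it may be worth checking at boot that the image is actually running on a chip with enough RAM. A minimal sketch, assuming the device ID values 0x413 ('417) and 0x419 ('437) quoted earlier in this thread and the 128K/192K sizes from the linker script above; the function names are hypothetical:

```c
#include <stdint.h>

#define DEVID_F417 0x413u   /* device ID values as quoted in this thread */
#define DEVID_F437 0x419u

/* Main RAM size (excluding CCM) for a given device ID; 0 if unknown. */
uint32_t ram_size_for_devid(uint32_t devid)
{
    switch (devid) {
    case DEVID_F417: return 128u * 1024u;
    case DEVID_F437: return 192u * 1024u;
    default:         return 0;
    }
}

/* Nonzero if an image linked for 'linked_ram_size' bytes of RAM can run
   on the detected device. A 128K image on a '437 is fine (it just leaves
   64K unused); a 192K image on a '417 is not. */
int build_fits_device(uint32_t devid, uint32_t linked_ram_size)
{
    uint32_t actual = ram_size_for_devid(devid);

    return actual != 0 && linked_ram_size <= actual;
}
```

The startup code would pass the value read from the device ID register and the RAM size the image was linked for, and refuse to continue (or signal an error) on a mismatch.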
« Last Edit: July 04, 2023, 01:22:58 pm by Nominal Animal »
 

