For those who'll read this thread: IMO the explanation by Nominal Animal is the most complete. I am talking about
https://www.eevblog.com/forum/microcontrollers/memory-model-for-microcontrollers/msg5478241/#msg5478241
However, I think it is a bit advanced, so I'll take the liberty of putting Nominal Animal's explanation in simple terms.
Assume we take STM32 ARM microcontrollers as hardware, and GCC as a toolchain.
The microcontroller's memory has many regions, and simply speaking, there are:
a) flash region: contains the firmware code. You can read bytes from that region as if it were simple RAM, but writing to that region is a complex procedure.
b) RAM region: you can read and write anything there.
c) peripheral controllers: memory-mapped IO. Reading and writing to that memory produces special effects that depend on the controller. For example, writing to a specific place inside a GPIO controller may turn a GPIO pin (e.g. an LED) on or off.
|-------[ FLASH ]--------[ RAM ]----------------------[ GPIOA ]--[GPIOB]---[ other peripheral controllers....]------|
So, that's what the hardware defines. There is no .text, no .heap, nothing like that. Just those memory regions, and if you manage to create firmware code - a .bin file - and write it to flash, that's all that matters. The MCU starts executing that code from the FLASH region, and the code can access RAM and peripherals, that's it.
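To make the "just address ranges" point concrete, here is a minimal sketch that sorts an address into one of the regions from the picture above. The boundaries are an assumption taken from the default ARM Cortex-M memory map (code below 0x20000000, SRAM from 0x20000000, peripherals from 0x40000000); a particular STM32 part populates only a portion of each range. The names classify_address() and region_name() are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the regions in the diagram above. */
enum region { REGION_CODE, REGION_SRAM, REGION_PERIPH, REGION_OTHER };

/* Classify an address using the default Cortex-M address map (assumed). */
static enum region classify_address(uint32_t addr) {
    if (addr < 0x20000000u) return REGION_CODE;   /* flash lives in here */
    if (addr < 0x40000000u) return REGION_SRAM;   /* on-chip RAM */
    if (addr < 0x60000000u) return REGION_PERIPH; /* GPIO, UART, timers... */
    return REGION_OTHER;                          /* external memory, system space */
}

/* Human-readable name, handy when printing from a host-side test. */
static const char *region_name(enum region r) {
    switch (r) {
    case REGION_CODE:   return "FLASH/code";
    case REGION_SRAM:   return "RAM";
    case REGION_PERIPH: return "peripherals";
    case REGION_OTHER:  return "other";
    }
    assert(0 && "unknown region");
    return "?";
}
```

With this, classify_address(0x08000000u) lands in the code region and classify_address(0x20000000u) in RAM, which matches the addresses used later in the linker script discussion.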
Now, you can use different programming languages / compilers to produce a firmware binary. If you use C, then your program is, simply speaking, a set of functions and variables. For example, this:
static int led_state;       // A variable. Goes to the RAM region.
void toggle_led(void) {     // A function. Goes to the FLASH region.
    GPIOA->BSRR = ..... led_state ...   // Code that accesses a variable in the RAM region and a controller in the GPIOA region. BSRR is write-only, so plain assignment is used.
}
int main(void) {
    ....
    toggle_led();
}
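For readers who want something that compiles, here is a hedged sketch of the same idea that can be tried on a PC. It assumes the usual STM32 BSRR layout (bits 0..15 set a pin, bits 16..31 reset it); bsrr_word() is a made-up helper, and this toggle_led() takes the register address as a parameter so that a plain variable can stand in for the real GPIOA->BSRR.

```c
#include <assert.h>
#include <stdint.h>

static int led_state;   /* a variable: the linker places it in RAM */

/* Hypothetical helper: build the word to write to BSRR for one pin.
   Assumes the STM32 layout: bit n sets pin n, bit n+16 resets it. */
static uint32_t bsrr_word(unsigned pin, int on) {
    assert(pin < 16u);
    return on ? (1u << pin) : (1u << (pin + 16u));
}

/* A function: the linker places its machine code in FLASH.
   'bsrr' is the register address; on real hardware it would be
   &GPIOA->BSRR from the vendor header, here it is a parameter. */
static void toggle_led(volatile uint32_t *bsrr, unsigned pin) {
    led_state = !led_state;
    *bsrr = bsrr_word(pin, led_state);  /* a single write, no read-modify-write */
}
```

The volatile qualifier tells the compiler that every access to the register matters and must not be optimized away, which is what makes memory-mapped IO work from C.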
The task of the toolchain is to produce a firmware binary. Inside the binary, as stated above, there are no notions of .text/.heap/.data. Only machine code and pieces of data at specific addresses.
So how does the toolchain produce that binary? There are several steps:
a) The GCC compiler compiles your code into object files. Object files are in the ELF format. That format specifies so-called sections. Code (functions) goes into the .text section. Data (variables) goes into the .data section. Note that other languages can also produce ELF, so this is not really C specific.
b) The produced object files are linked together by a linker to produce an output firmware.elf file. The linker uses a "linker script" file that has instructions like this: "put the .text section into the memory region that starts at 0x8000000, and the .data section into the memory region that starts at 0x20000000". Of course, 0x8000000 is the FLASH region, and 0x20000000 is the RAM region. That's the place where ELF sections are mapped to the hardware memory regions.
c) The "objcopy" tool extracts the .text and .data sections from firmware.elf and concatenates them into firmware.bin. So essentially the firmware.bin file is a collection of all functions, followed by all variables.
That's it!
Now, if you look closer, you see that GCC needs to create two functions from the above example: main() and toggle_led(). So in the FLASH memory, there will be the main() code, followed by the toggle_led() code (or vice versa). Likewise, for variables, there is an int variable "led_state" that needs to go into the RAM region. If there are more variables defined by the program, they will be placed in the RAM region one after another.
That's what they call compile-time allocation. When the linker is done arranging functions and variables in their places, it can generate a .map file listing the resulting addresses, if the appropriate build flag is given (with GCC, for example, -Wl,-Map=firmware.map).
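The "one after another" placement can be pictured with a toy model. This is only a sketch of the idea, not how a real linker is implemented: struct symbol and place_symbols() are invented for this post, and the second and third variable names are made up. It walks a list of variables, aligns each one, assigns it the next free address in RAM, and returns the first address past the last variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy description of a variable the "linker" has to place (hypothetical). */
struct symbol {
    const char *name;
    uint32_t size;   /* bytes */
    uint32_t align;  /* power of two */
    uint32_t addr;   /* filled in by place_symbols() */
};

/* Assign consecutive addresses starting at 'origin'.
   Returns the first free address after the last symbol. */
static uint32_t place_symbols(struct symbol *syms, size_t count, uint32_t origin) {
    uint32_t cursor = origin;
    for (size_t i = 0; i < count; i++) {
        uint32_t a = syms[i].align;
        assert(a != 0 && (a & (a - 1u)) == 0);   /* must be a power of two */
        cursor = (cursor + a - 1u) & ~(a - 1u);  /* round up to alignment */
        syms[i].addr = cursor;
        cursor += syms[i].size;
    }
    return cursor;
}
```

For example, placing led_state (4 bytes), a 1-byte flag and a 4-byte counter starting at 0x20000000 gives them the addresses 0x20000000, 0x20000004 and 0x20000008, and the function returns 0x2000000C. A .map file is essentially a printout of such a table for the real program.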
When all variables are placed in RAM, there is free RAM space left. That will be the heap. And at the very end of the RAM region there is the stack, which grows downwards towards the heap. So if we magnify the RAM region, we'll see something like this:
---[led_state|var2|var3|...|end| HEAP---> <-- STACK]-------
|<--------------------------------- RAM --------------------------------->|
Usually linker scripts mark the end of all variables with a special symbol, "end", so that the malloc function knows where the heap starts. From the linker script it also knows where RAM ends and what the maximum configured stack size is. That's how malloc knows its boundaries.
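Here is a minimal sketch of how those boundaries can be used, written so it runs on a PC. It is loosely modelled on the _sbrk() hook that newlib's malloc calls to get more memory, but everything here is an assumption for illustration: fake_ram stands in for the RAM region, heap_start plays the role of the "end" symbol, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_SIZE   1024u  /* pretend RAM region (arbitrary size) */
#define VARS_SIZE  64u    /* space taken by led_state, var2, var3, ... */
#define STACK_SIZE 256u   /* maximum stack size reserved at the top of RAM */

static uint8_t fake_ram[RAM_SIZE];

/* Plays the role of the linker's "end" symbol: first byte after the variables. */
static uint8_t *const heap_start = fake_ram + VARS_SIZE;
/* The heap must not grow into the space reserved for the stack. */
static uint8_t *const heap_limit = fake_ram + RAM_SIZE - STACK_SIZE;

static uint8_t *heap_break;  /* current end of the heap */

/* Grow (or shrink) the heap by 'incr' bytes.
   Returns the previous break, or (void *)-1 if the request does not fit. */
static void *sbrk_sketch(ptrdiff_t incr) {
    assert(heap_start <= heap_limit);
    if (heap_break == NULL) {
        heap_break = heap_start;  /* first call: heap begins at "end" */
    }
    uint8_t *prev = heap_break;
    if (incr > heap_limit - prev || incr < heap_start - prev) {
        return (void *)-1;        /* would hit the stack or go below "end" */
    }
    heap_break = prev + incr;
    return prev;
}
```

On a real target the two boundary pointers would come from symbols defined in the linker script rather than from an array, and a failed request would normally also set errno.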
This is my understanding of how the memory model works, and of its interaction with ELF sections. Note that this description is simplified: I've omitted a bunch of stuff like vectors, startup code and .data relocation, BSS, etc.