Microcontrollers / Re: Memory model for Microcontrollers
« Last post by Nominal Animal on Today at 07:04:02 am »
Quote
but I'd claim that the "model" remains the same.
Yes. It can be extended by additional sections when using ELF-based toolchains, and its layout is controllable via the linker script or linker options, but the underlying idea/model –– and purpose –– remains the same.
Thinking about the ways you can maximally utilize your Flash and RAM is the same thing as thinking about how to best use the "model".
(I personally am a big proponent of using those additional sections, both to provide useful build-time and run-time features and to make the source code more maintainable in the long term. I use them on all sorts of architectures, from small AVR firmware to fully featured x86-64 Linux applications.)
The Arduino environment is not the best example of how to utilize the "model", especially on older AVRs. The design of the "model" in the Arduino environment is a compromise: it was heavily limited by the choice of the GNU g++ compiler, which for technical reasons cannot handle the separate address spaces and machine instructions used to access Flash and RAM. The GNU gcc compiler can, but that would have limited Arduino sketches to freestanding C instead of the freestanding C++ they chose to use. (Clang can, in both C and C++, but for AVRs you'll want version 16 or later, released in early 2023; so it wasn't a possible choice for Arduino until very recently.)
Fortunately, for ARM Cortex-M cores and many others, those with a single address space for Flash and RAM, the "model" does not have such low-level restrictions. Even later AVRs have such a unified address space, because it makes development for them in C and C++ so much easier.
If we ignore small differences –– whether there is a stack at all, one stack for both return addresses and data, a return-address stack only, or separate stacks for return addresses and function local variables –– I do believe the base model originates in the very first true operating systems that allowed executing a program and returning to the initial system state afterwards; i.e. not simply grabbing the hardware, but "playing nice" with others.
Quote
The "memory model" of having text, data, bss, stack, and heap is a C language thing.
Hmm. I should clarify that AFAIK this model originated with C (on the PDP-11), or perhaps one of C's predecessors (Algol? BCPL? The Fortran compilers I used didn't have a stack.) However, it's a pretty useful and workable memory model that is now used for many other languages as well.
I mean, surely there were (similar) implementations before that, too, but at that point in time such a "model" became mandatory for tracking the resources a program would use, and for providing interfaces through which programs could request and release data memory from the operating system. (I do believe that for predecessors to ANSI C, the first such interfaces used brk() or sbrk() instead of malloc()/free().)
Simply put, dividing the memory address space needed into code, uninitialized data, initialized data, and optionally stack, is the minimum requirement for such tracking and management. It also turns out that even when developing embedded firmware or kernels, such divisions (and more detailed subdivisions and additions) are useful and effective for a number of purposes, so it just stuck.
In a very real sense, it is the lowest common denominator of machine-code memory organization.
Quote
I just recently saw someone making the argument that C has come to define the de facto standard ABI for pretty much all other programming languages.
An ABI, or application binary interface, is a different thing from the "model" discussed here. An ABI describes how data is passed between different programs, or between an "application" and a "kernel" or "library".
C basically requires a calling convention where one or more variables can be passed to the callee (called code) by value (i.e., changes done in the callee are not visible to the caller), and the callee returns a single variable.
The argument is that some ABIs are designed to optimize this and only this, with other details –– like which registers the callee is required to keep unchanged and which it can modify ("clobber" is the term used!) –– based on C; and that this yields suboptimal results.
I agree. C is not the be-all, end-all of programming languages, and although it is currently the easiest choice for binding –– interfacing to code written in other programming languages ––, it should not be used as the only design criterion or guide.
Fortunately, compilers can support more than one ABI at the same time, even on the same hardware architecture. For example, the embedded ARM Cortex-M ABI –– arm-none-eabi or arm-none-eabihf in ELFy terms –– differs from the ABI used in Linux (arm-linux-eabi or arm-linux-eabihf). (The hf at the end indicates hardware floating-point support.)
32-bit Intel/AMD x86 and 64-bit x86-64 have several different ABIs, originating from different operating systems.
Linux happens to use the SysV ABI on both x86 and x86-64. You can find its description for example here. The x86-64 System V ABI has one peculiarity I've found very nice, even in C, because it was not designed solely with C in mind: it supports returning two different (integer or pointer) variables in registers (as well as passing up to six (integer or pointer) variables in registers). It means that I can do e.g.
Code: [Select]
#include <stdint.h>  /* int64_t, uint64_t */

typedef struct {
    int64_t value;
    int64_t status;
} i64_value_status;

i64_value_status my_function(int64_t a, int64_t b, void *cptr, long d, void *eptr, uint64_t f);
and all parameters, as well as the result value and status, will be passed in registers, without going through the "slow" stack.

Thus, once again, the ABI question is not really a technical one, but a human one: how does one agree upon a good ABI? Using C as your only yardstick is not a good way. But, humans being humans, when your only tool is a hammer, all problems look like nails; and many humans do tend to stick to a single tool.