I'll take the liberty to put Nominal Animal's explanation in simple terms.
I didn't know what depth/complexity would be most useful, so I did what I thought would be most useful.
Having someone else do the same at a different depth/complexity is extremely useful; thanks!
How would one go about telling the compiler to put this lookup table in its own section in flash, and not in the RAM?
Generally, using
    __attribute__((section ("sectionname"))) const type varname[] = { values..., };
With older AVRs, which have separate Flash and RAM address spaces and separate machine instructions to access them, we have the __flash qualifier, which causes the compiler (C only for GCC; C or C++ for Clang >= 13) to put the object in the .progmem.data section. To use a different section, you use an alias:
    __attribute__((section ("sectionname"))) static const type internal_varname[] = { values..., };
    extern __attribute__((alias ("internal_varname"))) __flash const type varname[ sizeof internal_varname / sizeof internal_varname[0] ];
Because of the __flash qualifier, accesses to varname will use the LPM instruction.
For compatibility, it is best to use sectionname values that begin with .progmem. (note the trailing dot).
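To make the alias recipe concrete, here is a minimal sketch, assuming GCC or Clang and ELF object files. The names squares, squares_storage, square_of, the FLASH macro and the section name .progmem.squares are all made up for illustration. The FLASH macro expands to nothing off AVR so that the same file also builds on an ordinary ELF host, and the used attribute is an extra precaution of mine, not part of the recipe above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* On AVR, __flash makes reads use LPM; elsewhere FLASH expands to nothing
   so this sketch still builds on an ordinary ELF host. */
#if defined(__AVR__)
#define FLASH __flash
#else
#define FLASH
#endif

/* Storage lives in a custom section that keeps the .progmem. prefix.
   `used` is an extra precaution of this sketch, not part of the recipe. */
__attribute__((section(".progmem.squares"), used))
static const uint8_t squares_storage[] = { 0, 1, 4, 9, 16, 25, 36, 49 };

/* Public name: an alias for the same bytes, carrying the FLASH qualifier. */
extern __attribute__((alias("squares_storage")))
FLASH const uint8_t squares[sizeof squares_storage / sizeof squares_storage[0]];

/* Accessor that reads through the public (qualified) name. */
uint8_t square_of(size_t index)
{
    assert(index < sizeof squares / sizeof squares[0]);
    return squares[index];
}
```

All reads go through squares, so on AVR the compiler sees the qualifier on every access, while squares_storage only decides which section the bytes end up in.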
Other compilation units (source files compiled to different object files) that access the array should use the declaration
    extern __flash const type varname[size-if-known];
where size-if-known is the number of elements, or is omitted if unknown.
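As a rough illustration of the size-omitted form, the sketch below declares a table the way a shared header might, and lets the caller pass the element count, because sizeof cannot be applied to an array of unknown size. The names gamma_table, gamma_at and FLASH are hypothetical. The definition at the bottom only stands in for the other compilation unit so that the sketch links by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FLASH is __flash on AVR and empty elsewhere (an assumption of this sketch). */
#if defined(__AVR__)
#define FLASH __flash
#else
#define FLASH
#endif

/* What a shared header would declare: element count omitted. */
extern FLASH const uint8_t gamma_table[];

/* The caller supplies the count; sizeof gamma_table is not available here. */
uint8_t gamma_at(size_t index, size_t count)
{
    assert(index < count);
    return gamma_table[index];
}

/* Stand-in for the defining compilation unit (illustrative values only). */
FLASH const uint8_t gamma_table[] = { 0, 1, 4, 11, 23, 42, 69, 105 };
```

If the size is known and fixed, putting it in the declaration lets the compiler check constant indices and makes sizeof usable in every file.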
(The reason this alias trick works correctly is that the __flash qualifier tells the compiler to generate the correct code (LPM et cetera, instead of LDS et cetera), whereas symbol references use only the name, without any section reference. Thus, similar tricks will work in all compilers that have such qualifiers and use ELF object files and symbol aliasing.)
The assembly source generated will be similar to
        .section sectionname,"aw",@progbits
        .type internal_varname, @object
        .size internal_varname, size-in-bytes
    internal_varname:
        array contents
        .global varname
        .set varname,internal_varname
generating the expected machine code and object files; internal_varname will be a local symbol, not accessible from other compilation units.
Another option is to use only the __flash qualifier, compile the lookup table to a separate object file, and then change the section name using e.g.
    avr-objcopy --rename-section .progmem.data=.progmem.name lookuptable.o
Even older versions of GCC (I have 5.3.0, for example) generate acceptable code for AVRs, but for Clang, you'll want version 16 or later.
In general, __attribute__((section ("sectionname"))) used with a function or data object (variable or constant) picks up the "type" of the section, for the assembler, from what it is used with; so technically, setting a section affects both the function or data object it is used with and the section declaration itself. GCC and Clang linker scripts, however, use only the section name, so using common prefixes with '.' separators makes writing linker scripts easier. Standard linker scripts, for example, tend to treat all sections starting with .text. as just another .text section, so with suitable standard prefixes in the section names, one won't need to modify the linker scripts at all. (The __flash or equivalent qualifier is needed to tell the compiler which instructions can be used to access the variable or data object. The AVR io attribute works in a similar way, but for register and I/O bit manipulation.)
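A small sketch of the prefix convention, assuming GCC or Clang on a single-address-space ELF target: the section names .rodata.gain_table and .text.fast_path, and the identifiers gain_table and apply_gain, are invented for the example. Because both names start with a standard prefix, a typical linker script with wildcard patterns for .rodata.* and .text.* should collect them without changes.

```c
#include <assert.h>
#include <stdint.h>

/* Data object: the section "type" (read-only data) comes from the const
   object itself; only the name is chosen here. */
__attribute__((section(".rodata.gain_table")))
const uint16_t gain_table[4] = { 256, 362, 512, 724 };

/* Function: same attribute, but the section becomes executable code
   because it is attached to a function. */
__attribute__((section(".text.fast_path")))
uint32_t apply_gain(uint32_t sample, unsigned step)
{
    assert(step < 4u);
    /* Gains are 8.8 fixed point: 256 means 1.0, 512 means 2.0. */
    return (sample * gain_table[step]) >> 8;
}
```

The same attribute yields a data section in one place and a code section in the other, which is the "picks up the type from what it is used with" behaviour described above.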
Would the compiler do this automatically if you declare the lookup table as static/const/readonly?
For certain targets (those with a single Flash + RAM address space), with certain compiler options (-fdata-sections, to put each data variable or object into a separate section) and suitable linker scripts, yes.
For example, ARM Cortex-M targets using standard linker scripts and the -fdata-sections GCC or Clang compiler option would do this.
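For the single-address-space case, a plain const table is all the source needs. The sketch below uses made-up names (sine_quarter, sine_lookup) and illustrative data. My expectation, which you should verify on your own toolchain, is that with -fdata-sections the table ends up in a section named after the object, something like .rodata.sine_quarter, which a typical Cortex-M linker script maps to Flash. Listing the section headers of the object file (for example with arm-none-eabi-objdump -h) is one way to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Quarter-wave sine table, nine points from 0 to 90 degrees, scaled to
   0..255. No attribute or qualifier: const alone marks it read-only. */
static const uint8_t sine_quarter[9] = {
    0, 50, 98, 142, 180, 212, 236, 250, 255
};

/* Ordinary array indexing; no special access instructions are needed on
   a target where Flash and RAM share one address space. */
uint8_t sine_lookup(size_t index)
{
    assert(index < sizeof sine_quarter / sizeof sine_quarter[0]);
    return sine_quarter[index];
}
```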
Would the compiler possibly generate some initialization code that copies the data from flash to RAM on startup for ease of access?
No, that is done by the boot loader or initialization code, which is also responsible for setting up the stack and, on more complex microcontrollers, for setting up clock sources, training DRAM, and so on.
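To show what that initialization code typically does for initialized variables, here is a hedged sketch of the two loops many startup files contain. The function names copy_data and zero_bss are mine. On a real target the pointer arguments would come from symbols defined in the linker script (names such as _sidata, _sdata, _edata are common but vary between vendors, so treat them as assumptions); here they are parameters so the logic can be exercised on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the initial values of initialized variables from their load
   address in Flash to their run address in RAM, one word at a time. */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    assert(dst <= dst_end);
    while (dst < dst_end)
        *dst++ = *src++;
}

/* Clear the zero-initialized region so that statics start at zero. */
void zero_bss(uint32_t *dst, const uint32_t *dst_end)
{
    assert(dst <= dst_end);
    while (dst < dst_end)
        *dst++ = 0u;
}
```

A const lookup table that stays in Flash is not part of either region, so it costs no RAM and no startup time.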
Do embedded tools typically give you control over this, and to visualize the binary layout that will be produced by the compiler/linker?
Yes. There are two separate cases: one where the boot loader is separate, possibly in a reserved Flash section, and does all the initialization; and the other where there is no boot loader per se (and the system cannot update its own firmware; that is done using external tools).
In the first case, the boot loader code base is separate. It makes some promises, minimally in the form of a linker script that specifies the memory ranges available to the application part and the address of the symbol that is executed once the boot loader is done. These should be documented somewhere, especially if the boot loader also provides additional functions in Flash that one can utilize, but the linker script does describe the minimum details and the layout.
In the second case, the initialization code is linked with the application part, so the final link map (which you can obtain from the linker tools) will describe the full firmware image. In this case too, the linker script will describe the binary layout. (It varies whether the locations of memory-mapped peripherals are defined in the linker script, accessed via fixed-address pointers in the source code, or accessed via fixed-address C/C++ extension attributes in the source code. The last one is common in older Harvard architectures with multiple address spaces, for example when bit access instructions are limited to a small I/O address range. So, the linker script may not describe how all peripherals are accessed, even if they are in the same address space; for those, you need to examine the source code.)
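As an illustration of the "fixed-address pointers in the source code" variant, here is a sketch of a register block for an imaginary UART. Everything in it is hypothetical: the struct layout, the TX-ready bit, and the address in the comment. The accessor takes the register block as a parameter so that the same code can be pointed at an ordinary struct on a host; on hardware you would pass the fixed-address pointer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register layout of a made-up UART; offsets and bits are hypothetical. */
struct uart_regs {
    volatile uint32_t status;
    volatile uint32_t data;
};

#define UART_STATUS_TX_READY (1u << 0)

/* On a real part, the address would come from the datasheet, e.g.
   #define UART0 ((struct uart_regs *)0x40001000u)   (invented address)
   and would appear only in the source, not in the linker script. */

/* Returns 1 if the byte was written, 0 if the transmitter was busy. */
int uart_try_send(struct uart_regs *uart, uint8_t byte)
{
    assert(uart != NULL);
    if ((uart->status & UART_STATUS_TX_READY) == 0u)
        return 0;
    uart->data = byte;
    return 1;
}
```

With this style the link map says nothing about where the UART lives, which is why you need to examine the source code for such peripherals.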