I am also not running any C++ library, I've not dared get into that yet.
You don't need to, either; all you really need is a runtime startup that copies initialized data, and executes constructors.
Whenever you statically instantiate a class using an empty initializer body, you have initialized data that gets "set up" whenever you copy initialized data from Flash to RAM.
Whenever you statically instantiate a class using a non-empty initializer body, the compiler constructs a special no-parameter no-return dynamically-named function to execute the initializer body in the code section (typically .text). I am not exactly sure about the language spec details between different C++ versions and implementations, but my suggestion is to assume that it may either put compile-time constants (including references to other objects) in initialized data, or initialize them in the initializer function code. The address of this dynamically named function is then added to the .init_array section, which is simply a list of such pointers. Destructors are handled similarly, via the .fini_array section.
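To make the two cases concrete, here is a minimal sketch. The names Point, Counter, origin, boot_counter and constructed are made up for this illustration, and where exactly each object ends up depends on the compiler and optimization level, so treat the comments as the typical outcome rather than a guarantee.

```cpp
// Sketch only: all names here are hypothetical.

struct Point {
    int x, y;
    // Empty constructor body: members are set from compile-time constants.
    constexpr Point(int px, int py) : x(px), y(py) {}
};

// Typically ends up as plain initialized data (for example in .data), so the
// Flash-to-RAM copy in the startup code is all that "constructs" it.
static Point origin(3, 4);

static int constructed;   // zero-initialized, typically in .bss

struct Counter {
    int value;
    // Non-empty constructor body: real code that has to run at startup.
    explicit Counter(int start) {
        value = start + 1;
        ++constructed;
    }
};

// The compiler typically emits a hidden initializer function for this object
// and adds its address to .init_array. An optimizer may still fold simple
// cases like this one into initialized data.
static Counter boot_counter(41);
```

Either way, by the time main() starts, origin holds 3 and 4, boot_counter.value is 42, and constructed is 1.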
In C (or extern "C" scope in C++), you can use __attribute__((constructor)) static void functionname(void) { ... } to add a locally-visible function to the .init_array section pointer array in file scope order, or __attribute__((constructor (priority))) static void functionname(void) { ... } in priority order, with 101 run first, 65534 run just before those in file scope order, and 65535 run in file scope order. The destructor attribute and .fini_array work similarly (with inverted priority order, so that normally both constructors and destructors use the same priority). See GCC documentation for details.
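A minimal sketch of the priority ordering, assuming GCC or Clang on an ELF target. The function names and the order[] log are made up for this example. The log is a plain zero-initialized array, so it is valid before any constructor runs.

```cpp
// Sketch only: function names and the order[] log are hypothetical.

static int order[4];      // zero-initialized before any constructor runs
static int order_count;

static void record(int id) {
    if (order_count < 4) order[order_count++] = id;
}

// No priority: runs in file scope order, after all prioritized constructors.
__attribute__((constructor)) static void init_default(void) { record(3); }

// Larger priority number: runs later among the prioritized constructors.
__attribute__((constructor(102))) static void init_second(void) { record(2); }

// Lowest generally usable priority: runs first.
__attribute__((constructor(101))) static void init_first(void) { record(1); }

// Destructors use the inverted order, so the one with priority 101 runs last.
__attribute__((destructor(101))) static void fini_last(void) {
    /* release whatever init_first() set up */
}
```

Even though init_first() is defined last in the file, the log reads 1, 2, 3 when main() starts.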
In C++, you can use __attribute__((init_priority (priority))) Classname instancename; or __attribute__((init_priority (priority))) Classname instancename = Classname(...); to control the order of initializer functions; again 101 runs first, 65534 just before those in file scope order, and 65535 in file scope order. See GCC documentation for details.
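The same idea for C++ objects, as a minimal sketch assuming GCC or Clang. The Tracker class and the object names are made up for this example. I put the attribute after the object name here, which is the placement used in the GCC documentation example.

```cpp
// Sketch only: Tracker, late, early, plain and the log are hypothetical.

static int log_ids[4];    // zero-initialized before any constructor runs
static int log_count;

class Tracker {
public:
    // Non-empty body, so each object needs an initializer function at startup.
    explicit Tracker(int id) {
        if (log_count < 4) log_ids[log_count++] = id;
    }
};

// Defined first, but constructed second because of the larger priority number.
Tracker late __attribute__((init_priority(102))) = Tracker(2);

// Defined second, but constructed first.
Tracker early __attribute__((init_priority(101))) = Tracker(1);

// No priority: constructed in file scope order, after the prioritized objects.
Tracker plain(3);
```

When main() starts, the log reads 1, 2, 3, regardless of the definition order in the file.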
Note that underneath, both the constructor and init_priority attributes use the exact same toolchain machinery.
With GCC/Clang/binutils toolchains on most ELF targets, the C runtime includes code to call each pointer in the .init_array section as a function before calling main(), after copying initialized data from Flash/ROM to RAM. This is why minimal freestanding C++ support does not require a C++ library, only a C++-compatible C runtime.
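The two startup steps can be sketched roughly as follows. This is not taken from any particular runtime: copy_data(), run_init_array() and demo_startup() are names I made up, and the arrays in demo_startup() are stand-ins for the address ranges that a real linker script exports (the .init_array bounds are commonly named __init_array_start and __init_array_end in GNU linker scripts, while the data section symbol names vary).

```cpp
#include <cstdint>

// Sketch only: all function names here are hypothetical.

typedef void (*init_fn)(void);

// Copy initialized data from its load address (Flash/ROM) to its run address (RAM).
static void copy_data(const std::uint32_t *src, std::uint32_t *dst, std::uint32_t *dst_end) {
    while (dst < dst_end) *dst++ = *src++;
}

// Call every function pointer in [start, end), in array order.
static void run_init_array(const init_fn *start, const init_fn *end) {
    for (const init_fn *fn = start; fn < end; ++fn) (*fn)();
}

// Stand-ins for compiler-generated initializer functions.
static int trace;
static void init_a(void) { trace = trace * 10 + 1; }
static void init_b(void) { trace = trace * 10 + 2; }

// Hosted demonstration: local arrays play the role of the linker-provided ranges.
int demo_startup(void) {
    static const std::uint32_t flash_image[3] = { 11, 22, 33 };
    static std::uint32_t ram_data[3];
    static const init_fn init_table[2] = { init_a, init_b };

    trace = 0;
    copy_data(flash_image, ram_data, ram_data + 3);   // step 1: initialized data
    run_init_array(init_table, init_table + 2);       // step 2: constructors

    // On real hardware the startup code would call main() at this point.
    return (ram_data[2] == 33) ? trace : -1;
}
```

Here demo_startup() returns 12, showing that init_a() ran before init_b(), after the data copy.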
In reply #27 above, the Closure example shows how you can create and use an array of pointer pairs (one a void (*fun)(void *obj) function pointer, and the other a void *obj object reference) to construct an array across a number of source files, and execute them at a desired point in your main(). On 32-bit ARM, each closure is just 8 bytes, and can reside in Flash, so it is quite an effective method.
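A minimal sketch of this kind of closure array, assuming GNU ld on an ELF target, where the linker provides __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The CLOSURE() macro, the "closures" section name, and the example functions are my own placeholders, not necessarily what the example in reply #27 uses. On a bare-metal target the linker script also has to keep the section and place it in Flash.

```cpp
// Sketch only: CLOSURE(), the "closures" section name, and the example
// functions are hypothetical.

struct closure {
    void (*fun)(void *obj);   // function to call
    void *obj;                // object reference passed to it
};

// Emit one entry into the "closures" section. The explicit alignment keeps the
// compiler from padding between entries, so the section reads as a plain array.
#define CLOSURE(name, function, object)                                       \
    static const struct closure name                                          \
    __attribute__((section("closures"), used,                                 \
                   aligned(alignof(struct closure)))) = { (function), (object) }

// Section bounds provided by GNU ld.
extern const struct closure __start_closures[];
extern const struct closure __stop_closures[];

static int total;
static int five = 5, seven = 7;

static void add_to_total(void *obj) {
    total += *static_cast<int *>(obj);
}

// These two entries could just as well live in different source files.
CLOSURE(add_five, add_to_total, &five);
CLOSURE(add_seven, add_to_total, &seven);

// Execute every closure collected in the section.
void run_closures(void) {
    for (const struct closure *c = __start_closures; c < __stop_closures; ++c)
        c->fun(c->obj);
}
```

Calling run_closures() once from main() adds 5 and 7 to total. The order of entries from different source files depends on link order.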
The main benefit of using a dedicated section (or sections) to collect entries across multiple source files is compile-time/link-time modularity. For example, if you look at my RPN calculator example, the DEFINE_OP() macro defines a stack-based operation by name, description, and function pointer, into a dedicated "ops" section. In main(), the do_op() function traverses the array in the ops section to find the operator name, and applies the operator if found. This means that modifying or adding new operators requires only recompiling that operator source file, and relinking the binary.
The downside is that the array entry order is not easily controlled across source files. To sort them, I usually create a script that dumps the section in the linked object, and generates C source code reproducing the section contents in the desired order. That way, sorting the section array contents is a post-link operation: run the script, compile the generated C source code into an ELF object file, and replace the array section in the linked file with the one from the ELF object file. ELF object files can be directly modified, of course; it's just that each target architecture has its own relocation types, so both the array section and any relocations targeting it have to be modified. I've found that regenerating the array contents in C and compiling that is much more robust, even if it involves parsing objdump output.
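To illustrate the DEFINE_OP() and do_op() idea, here is a minimal sketch, again assuming GNU ld on an ELF target. The macro arguments, the rpn_stack structure, the return codes, and the function signatures are my assumptions for this illustration and may differ from the actual RPN calculator example.

```cpp
#include <cstring>

// Sketch only: structure layouts, signatures, and return codes are hypothetical.

struct rpn_stack {
    double item[16];
    int    depth;
};

struct op {
    const char *name;                   // operator name the user types
    const char *desc;                   // one-line description
    int (*func)(struct rpn_stack *);    // returns 0 on success
};

// Declare the function, emit its entry into the "ops" section, then open the
// function definition. The explicit alignment avoids padding between entries.
#define DEFINE_OP(ident, opname, opdesc)                                      \
    static int ident(struct rpn_stack *);                                     \
    static const struct op ident##_entry                                      \
    __attribute__((section("ops"), used,                                      \
                   aligned(alignof(struct op)))) = { opname, opdesc, ident }; \
    static int ident(struct rpn_stack *s)

DEFINE_OP(op_add, "+", "Add the two topmost values") {
    if (s->depth < 2) return -1;
    s->item[s->depth - 2] += s->item[s->depth - 1];
    s->depth--;
    return 0;
}

DEFINE_OP(op_mul, "*", "Multiply the two topmost values") {
    if (s->depth < 2) return -1;
    s->item[s->depth - 2] *= s->item[s->depth - 1];
    s->depth--;
    return 0;
}

// Section bounds provided by GNU ld.
extern const struct op __start_ops[];
extern const struct op __stop_ops[];

// Find the named operator in the ops section and apply it; -2 if not found.
int do_op(struct rpn_stack *s, const char *name) {
    for (const struct op *o = __start_ops; o < __stop_ops; ++o)
        if (std::strcmp(o->name, name) == 0)
            return o->func(s);
    return -2;
}
```

Adding a new operator is then just another DEFINE_OP() block in any source file that gets linked in; do_op() itself never needs to change.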