Is there a tool which one can run on a whole project and which can report which functions are __weak but are actually being called by real code?
For 'are actually being called', no. For 'may be called by real code', sure.
The difference is this:
    if (prng() == 42) foo();
Is foo() called or not?
If you want to know whether foo() really gets called, you have on your hands a variant of the classic halting problem. The only way to know is to instrument foo() and run it. For some specific cases, it is possible to instrument the internal state (variables, function result values) that affects whether the function gets called, and prove the conditions under which the function gets called and when not.
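As a rough illustration of what "instrument foo() and run it" can mean, here is a minimal sketch. Everything in it is made up for the example: the call counter, the xorshift stand-in for prng(), and the count_foo_calls() helper are my assumptions, not anything from your project. The weak attribute is the GCC/Clang spelling.

```c
#include <stdint.h>

/* Hypothetical instrumentation: count how often foo() really runs. */
static unsigned long foo_calls;

/* Weak default implementation; a strong foo() elsewhere would replace it. */
__attribute__((weak)) void foo(void)
{
    foo_calls++;
}

/* Stand-in PRNG (xorshift32) so the sketch is deterministic per seed. */
static uint32_t prng_state = 1;

static uint32_t prng(void)
{
    uint32_t x = prng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    prng_state = x;
    return x % 100;
}

/* Run the "real code" for a number of rounds and report how many
   times foo() was actually called. */
unsigned long count_foo_calls(uint32_t seed, unsigned long rounds)
{
    prng_state = seed ? seed : 1;   /* xorshift state must be nonzero */
    foo_calls = 0;
    for (unsigned long i = 0; i < rounds; i++)
        if (prng() == 42)
            foo();
    return foo_calls;
}
```

The count only tells you about the runs you actually did; a zero does not prove foo() can never be called.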
However, if we want to know if the object file defines functions that are not called and whose addresses are never taken (noting that some function attributes like __attribute__((constructor)) and __attribute__((destructor)) actually cause the compiler to take the address of the function; it does not need to be explicitly taken), sure!
ld itself does exactly that when you compile stuff with -ffunction-sections -Wl,--gc-sections, to discard unneeded functions.
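To show why the constructor attribute matters here, a small sketch (GCC/Clang syntax; the names early_init and init_has_run are hypothetical). Nothing in the source ever calls or names early_init(), yet it runs before main(), so a naive "is this symbol referenced?" check would wrongly flag it as dead.

```c
/* Set by the constructor below; nothing else writes to it. */
static int init_ran = 0;

/* Never called and never referenced by name anywhere in the source.
   The toolchain records its address so the startup code runs it
   before main(). */
__attribute__((constructor))
static void early_init(void)
{
    init_ran = 1;
}

/* Lets other code check whether the constructor has run. */
int init_has_run(void)
{
    return init_ran;
}
```

Calling init_has_run() from main() should return 1 on a hosted GCC or Clang toolchain; on bare metal it depends on the startup code actually walking the init array.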
The procedure differs a bit when one uses dynamic linking (for stuff running under a proper OS with virtual memory), so I'll specifically limit to statically linked stuff, since microcontroller stuff is almost always statically linked.
You just generate a list of all function type symbols in the object files the project generates, the intermediate ELF .o files. We then go through the list, and check that each defined function symbol is referenced, or it is unused; and that each referenced function symbol is defined at least once, or it is used but not implemented. Simples!
There are two ways you can do this. The simple way is to parse objdump -t output. The other way is to parse the ELF files yourself. In the latter case, you actually only need to find the symbol and string tables, since they contain all the information needed. (All string-form data is used by reference, so it is a bit annoying; but the format is utterly stable. On Linux, you can just use the already provided <elf.h>.)
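If you go the ELF route, the per-symbol test is tiny once you have an Elf32_Sym in hand. Below is a sketch using the Linux <elf.h> definitions; the three helper names are mine, and locating the symbol table and resolving st_name through the string table is left out. It assumes 32-bit ELF, as for Cortex-M.

```c
#include <elf.h>
#include <stdbool.h>

/* True if the symbol is a function defined in this object file
   with global or weak binding. */
bool sym_is_function_def(const Elf32_Sym *s)
{
    unsigned bind = ELF32_ST_BIND(s->st_info);
    return ELF32_ST_TYPE(s->st_info) == STT_FUNC
        && s->st_shndx != SHN_UNDEF
        && (bind == STB_GLOBAL || bind == STB_WEAK);
}

/* True if the symbol has weak binding. */
bool sym_is_weak(const Elf32_Sym *s)
{
    return ELF32_ST_BIND(s->st_info) == STB_WEAK;
}

/* True if the symbol is only referenced here, not defined.
   Entry 0 of every symbol table is a null symbol with st_name == 0,
   so unnamed entries are skipped. Undefined symbols in .o files
   usually have type STT_NOTYPE, so the type is not checked. */
bool sym_is_reference(const Elf32_Sym *s)
{
    return s->st_shndx == SHN_UNDEF && s->st_name != 0;
}
```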
In the objdump -t output, we are interested only in the SYMBOL TABLE, the lines that begin with a hexadecimal digit. On elf32-littlearm (Armv7e-m Cortex-M7), this has five fields per line, with the second field being fixed-width, containing flags. F denotes functions, and w denotes weak symbols; l (lowercase letter L) denotes local symbols, and g denotes global symbols. We only need the second, third, and fifth fields.
When the second field has F and either g or w in it, it defines a function symbol (a weak one if it has the w; objdump shows w in place of g for weak symbols, not in addition to it). We need these in one list.
When the second field is all spaces (or has only a w, for a weak reference) and the third field is *UND* (exact string, not a pattern), it means that something in that object file references that symbol. We want these in a separate list.
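The two rules above can be sketched as a single line classifier. This is only a sketch under my assumptions about the GNU objdump -t layout for ELF: a hexadecimal address, one space, exactly seven flag characters, one space, the section name (objdump prints *UND* for undefined symbols), a tab, the size, and the symbol name as the last token on the line. The enum and function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum sym_kind { SYM_OTHER = 0, SYM_FUNC_DEF, SYM_WEAK_FUNC_DEF, SYM_REFERENCE };

/* Classify one line of objdump -t output. For anything other than
   SYM_OTHER, the symbol name is copied into name[]. */
enum sym_kind classify_line(const char *line, char *name, size_t name_size)
{
    size_t i = 0;

    /* Field 1: the address, at least eight hexadecimal digits. */
    while (isxdigit((unsigned char)line[i]))
        i++;
    if (i < 8 || line[i] != ' ')
        return SYM_OTHER;

    /* Field 2: exactly seven flag characters, then a space. */
    const char *flags = line + i + 1;
    if (strlen(flags) < 8 || flags[7] != ' ')
        return SYM_OTHER;

    /* Field 3: the section name, ended by a tab or a space. */
    const char *section = flags + 8;
    size_t seclen = strcspn(section, "\t ");
    int is_undef = (seclen == 5 && strncmp(section, "*UND*", 5) == 0);

    /* Last field: the symbol name is the final token on the line. */
    size_t end = strlen(line);
    while (end > 0 && isspace((unsigned char)line[end - 1]))
        end--;
    size_t start = end;
    while (start > 0 && !isspace((unsigned char)line[start - 1]))
        start--;
    size_t len = end - start;
    if (len == 0 || len >= name_size)
        return SYM_OTHER;
    memcpy(name, line + start, len);
    name[len] = '\0';

    if (is_undef)
        return SYM_REFERENCE;          /* referenced, defined elsewhere */
    if (flags[6] == 'F' && flags[1] == 'w')
        return SYM_WEAK_FUNC_DEF;      /* weak function definition */
    if (flags[6] == 'F' && flags[0] == 'g')
        return SYM_FUNC_DEF;           /* global function definition */
    return SYM_OTHER;                  /* locals, objects, sections, ... */
}
```

Feed it the output of objdump -t line by line (for example via popen() and fgets()), and header lines such as "SYMBOL TABLE:" simply come back as SYM_OTHER.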
Using your favourite scripting language (even Bash works fine for this if you use export LANG=C LC_ALL=C to explicitly set the default C locale), extract the two cases into separate lists or dictionaries. Then, it is just a matter of looking up which functions aren't referenced.
In C, a hash table on the function symbol name, with the data specifying whether it has been defined (and optionally in which object file or files and with which attributes), and whether it has been referenced (and optionally in which object file or files), should work very well and be extremely fast (bound by I/O bandwidth and latencies).
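A minimal sketch of such a table, assuming a fixed-size open-addressing design with a djb2-style hash; the flag names, the table size, and the helper functions are all my own choices for illustration, and the per-object-file bookkeeping is left out.

```c
#include <stdlib.h>
#include <string.h>

#define SYM_DEFINED     1u
#define SYM_WEAK        2u
#define SYM_REFERENCED  4u

#define TABLE_SLOTS 4096u   /* must be a power of two */

struct sym_entry {
    char     *name;   /* NULL marks an empty slot */
    unsigned  flags;  /* OR of the SYM_ flags above */
};

static struct sym_entry table[TABLE_SLOTS];

/* djb2-style string hash. */
static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Find the entry for name, creating it if needed.
   Returns NULL if the table is full or memory runs out. */
struct sym_entry *sym_lookup(const char *name)
{
    unsigned long first = hash_name(name);
    for (unsigned long n = 0; n < TABLE_SLOTS; n++) {
        struct sym_entry *e = &table[(first + n) & (TABLE_SLOTS - 1)];
        if (!e->name) {
            size_t len = strlen(name) + 1;
            e->name = malloc(len);
            if (!e->name)
                return NULL;
            memcpy(e->name, name, len);
            e->flags = 0;
            return e;
        }
        if (strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}

/* Add flags to a symbol. Returns 0 on success, -1 on failure. */
int sym_mark(const char *name, unsigned flags)
{
    struct sym_entry *e = sym_lookup(name);
    if (!e)
        return -1;
    e->flags |= flags;
    return 0;
}

/* Count the symbols whose flags, restricted to mask, equal want. */
size_t sym_count(unsigned mask, unsigned want)
{
    size_t count = 0;
    for (size_t i = 0; i < TABLE_SLOTS; i++)
        if (table[i].name && (table[i].flags & mask) == want)
            count++;
    return count;
}
```

With mask SYM_DEFINED | SYM_REFERENCED, asking for SYM_DEFINED counts the defined-but-unused functions, and asking for SYM_REFERENCED counts the used-but-not-implemented ones. Printing the names instead of counting them is the same loop.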
If you are asking for a tool that you can just install and run, I can't help you with those. They may or may not exist; I do not know.