embed the behavior within data_type
Yup, exactly.
Let's say we construct a completely new string formatting system, based on say
'{' [ number ':' ] type [ '/' options ] '}'
where number identifies the parameter to be formatted, type is a short string identifying the formatter, and options is a string passed as-is to the formatter. The number is useful in localization, so that the order of items in the string can differ from the order of the parameters to the function. The actual formatter function interface could be something like say
int formatter(buffer, pointer-to-data, options, context)
This assumes that instead of passing the values to be formatted as variadic arguments (to the actual printf-replacement), we pass their addresses. This avoids the standards-required type promotions for variadic arguments, like float to double. So, one could print e.g. Hello world to Serial using
format_string(Serial, "Hello, world!\n");
or say a debugging message related to the user poking a touch panel using
format_string(Serial, "Touch event at ({1:i},{2:i})\n", &x, &y);
If we limit the implementation to systems that support ELF object files and other formats with section support, we can use section magic to autoregister formatters at build time. That way, no RAM is wasted in describing the base set of formatters. Then, if someone wants to implement their own formatter, all they need to do is define the formatter function, and then use a preprocessor macro, say
DECLARE_FORMATTER(formatter, "type", context);
to add their own to the same list. The macro emits a data structure to a dedicated section, so that the linker gathers them all into a single array in memory. On a 32-bit system, it would make sense to limit the type names to 8 characters, so that the structure (name, function pointer, context pointer) would be 16 bytes (and reside in ROM/Flash).
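To make the section trick concrete, here is a minimal sketch. It assumes GCC or Clang with GNU ld or LLD on an ELF target, where the linker defines __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The struct layout, the section name "formatters", the simplified formatter signature, and find_formatter() are my own illustrative choices, and snprintf() is used only to keep the example formatters short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define FORMATTER_TYPE_MAX 8

/* Hypothetical formatter signature; the buffer is simplified to a char array. */
typedef int (*formatter_fn)(char *buf, size_t size, const void *data,
                            const char *options, void *context);

/* 8 + 4 + 4 = 16 bytes on a 32-bit target. */
struct formatter_entry {
    char         type[FORMATTER_TYPE_MAX]; /* zero-padded type name */
    formatter_fn func;
    void        *context;
};

/* Emit one entry into the "formatters" section. 'used' stops the compiler
   from discarding the otherwise unreferenced entry, and the explicit
   alignment stops it from padding entries apart, so the section can be
   walked as a plain array. */
#define DECLARE_FORMATTER(fn, name, ctx)                                \
    static const struct formatter_entry formatter_entry_##fn            \
        __attribute__((used, section("formatters"),                     \
                       aligned(__alignof__(struct formatter_entry))))   \
        = { name, fn, ctx }

/* Provided by the linker: start and end of the gathered section. */
extern const struct formatter_entry __start_formatters[];
extern const struct formatter_entry __stop_formatters[];

/* Linear search over the linker-built array. */
const struct formatter_entry *find_formatter(const char *type, size_t len)
{
    char key[FORMATTER_TYPE_MAX] = { 0 };

    assert(type != NULL);
    if (len == 0 || len > sizeof key)
        return NULL;
    memcpy(key, type, len);
    for (const struct formatter_entry *e = __start_formatters;
         e < __stop_formatters; e++)
        if (memcmp(e->type, key, sizeof key) == 0)
            return e;
    return NULL;
}

static int format_int(char *buf, size_t size, const void *data,
                      const char *options, void *context)
{
    (void)options; (void)context;
    return snprintf(buf, size, "%d", *(const int *)data);
}
DECLARE_FORMATTER(format_int, "i", NULL);

static int format_str(char *buf, size_t size, const void *data,
                      const char *options, void *context)
{
    (void)options; (void)context;
    return snprintf(buf, size, "%s", (const char *)data);
}
DECLARE_FORMATTER(format_str, "s", NULL);
```

Note that nothing in the code lists the formatters explicitly; a DECLARE_FORMATTER line in any source file of the project adds an entry. If you link with --gc-sections, the section may additionally need to be kept by the linker script.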
(It would be nice to sort that array as a post-linking step, so that a binary search could be used to quickly find the proper formatter.)
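As a rough sketch of that lookup: with fixed 8-byte, zero-padded names, the comparison is a plain memcmp() and the standard bsearch() does the rest. The table below is only a stand-in for the linker-gathered array, and I am assuming some post-link tool has already sorted it in memcmp() order; the names and the stub formatter are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FORMATTER_TYPE_MAX 8

typedef int (*formatter_fn)(char *buf, size_t size, const void *data,
                            const char *options, void *context);

struct formatter_entry {
    char         type[FORMATTER_TYPE_MAX]; /* zero-padded type name */
    formatter_fn func;
    void        *context;
};

/* Placeholder formatter so the table has something to point at. */
static int format_stub(char *buf, size_t size, const void *data,
                       const char *options, void *context)
{
    (void)buf; (void)size; (void)data; (void)options; (void)context;
    return 0;
}

/* Stand-in for the section array; must be sorted by memcmp() of the names. */
static const struct formatter_entry table[] = {
    { "date", format_stub, NULL },
    { "f",    format_stub, NULL },
    { "i",    format_stub, NULL },
    { "s",    format_stub, NULL },
    { "x",    format_stub, NULL },
};

/* bsearch() comparator: key is an 8-byte zero-padded name. */
static int compare_type(const void *key, const void *entry)
{
    const struct formatter_entry *e = entry;
    return memcmp(key, e->type, FORMATTER_TYPE_MAX);
}

const struct formatter_entry *find_formatter_sorted(const char *type, size_t len)
{
    char key[FORMATTER_TYPE_MAX] = { 0 };

    assert(type != NULL);
    if (len == 0 || len > sizeof key)
        return NULL;
    memcpy(key, type, len);
    return bsearch(key, table, sizeof table / sizeof table[0],
                   sizeof table[0], compare_type);
}
```

With only a handful of formatters a linear scan is just as good; the sorted variant pays off when the list grows.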
Since a formatter function is only referenced through the entry that declares it, the linker can trivially leave any undeclared formatters out of the final binary. This means preprocessor macros can easily be used to control which formatters are available by default.
On an AVR with separate address spaces for Flash and RAM, you could have an option that indicates when the pointer points to the non-default address space, or you could simply have different formatters for e.g. strings in Flash vs. strings in RAM.
If the buffer interface is not just an array, but also supports a window to the output buffer (so that in cases where the data is too long to fit into the buffer, only some of it is stored in the buffer, and another identical call but with a different window will generate more of it later), the same formatting interface can be bolted on to any kind of stream, FIFO, socket, file, or other contraption.
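One way the window idea could look, as a sketch only: the formatter always generates its complete output, but only the characters that fall inside the current window are stored. The struct window layout, window_put(), format_unsigned(), and emit_in_chunks() are names I made up for this illustration; a real consumer would hand each chunk to a UART, socket, or file instead of appending it to a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct window {
    char  *data;   /* storage for the visible part of the output */
    size_t size;   /* number of bytes available in data */
    size_t start;  /* offset of data[0] within the complete output */
    size_t pos;    /* offset of the next generated character */
};

/* Store c only if it falls inside the window; always advance the position. */
void window_put(struct window *w, char c)
{
    if (w->pos >= w->start && w->pos - w->start < w->size)
        w->data[w->pos - w->start] = c;
    w->pos++;
}

/* Example formatter: decimal unsigned long. Returns the total output length. */
size_t format_unsigned(struct window *w, const void *data)
{
    unsigned long value = *(const unsigned long *)data;
    char digits[3 * sizeof value];   /* more than enough decimal digits */
    size_t n = 0;

    do {
        digits[n++] = (char)('0' + value % 10u);
        value /= 10u;
    } while (value);
    while (n)
        window_put(w, digits[--n]);
    return w->pos;
}

/* Repeat the identical call with a sliding 4-byte window until everything
   has been produced; each chunk is appended to out (truncated if needed). */
size_t emit_in_chunks(const unsigned long *value, char *out, size_t outsize)
{
    char chunk[4];
    size_t start = 0, written = 0, total;

    assert(outsize > 0);
    do {
        struct window w = { chunk, sizeof chunk, start, 0 };
        total = format_unsigned(&w, value);
        size_t have = total - start < sizeof chunk ? total - start : sizeof chunk;
        size_t room = outsize - 1 - written;
        size_t copy = have < room ? have : room;
        memcpy(out + written, chunk, copy);
        written += copy;
        start += have;
    } while (start < total);
    out[written] = '\0';
    return total;
}
```

The cost is that the formatter redoes its work for every window, which is the trade being made: a few extra cycles in exchange for a tiny, fixed-size buffer.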
The interface itself should be re-entrant, so that each formatter can use the same interface to implement itself. For example, a formatter for date might use the context parameter to point to the current "locale", and format the date using said locale automagically, by picking the formatting string based on the locale. There is some risk here of accidentally deep recursion, though, if users create silly formatters, but C does not protect silly people from creating footguns, so I think it is mostly a documentation issue.
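To illustrate the re-entrancy, here is a small sketch of a format_string() whose date formatter is implemented by calling format_string() again with a pattern taken from the locale. Everything here is a simplification of my own: the output is a plain character buffer instead of Serial, the formatter table is an ordinary array, struct date, struct locale, and current_locale are hypothetical, literal braces cannot be escaped, and a placeholder without a number simply takes the parameter after the previous one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 8

/* Simplified output buffer: len counts every character produced, stored or not. */
struct outbuf { char *data; size_t size; size_t len; };

/* options points at the text after '/', which ends at '}' or NUL. */
typedef int (*formatter_fn)(struct outbuf *out, const void *data,
                            const char *options, void *context);

struct formatter_entry { const char *type; formatter_fn func; void *context; };
struct date   { int year, month, day; };
struct locale { const char *date_pattern; };
struct locale current_locale = { "{1:i}-{2:i}-{3:i}" };

int format_string(struct outbuf *out, const char *fmt, ...);

static void put(struct outbuf *out, char c)
{
    if (out->len + 1 < out->size) out->data[out->len] = c;
    out->len++;
}

static int format_int(struct outbuf *out, const void *data,
                      const char *options, void *context)
{
    int value = *(const int *)data;
    unsigned u = value < 0 ? 0u - (unsigned)value : (unsigned)value;
    char digits[3 * sizeof u];
    size_t n = 0;
    (void)options; (void)context;
    if (value < 0) put(out, '-');
    do { digits[n++] = (char)('0' + u % 10u); u /= 10u; } while (u);
    while (n) put(out, digits[--n]);
    return 0;
}

/* Re-entrant use: the pattern comes from the locale the context points to. */
static int format_date(struct outbuf *out, const void *data,
                       const char *options, void *context)
{
    const struct date *d = data;
    const struct locale *loc = context;
    (void)options;
    assert(loc != NULL);
    return format_string(out, loc->date_pattern, &d->year, &d->month, &d->day);
}

static const struct formatter_entry formatters[] = {
    { "i",    format_int,  NULL },
    { "date", format_date, &current_locale },
};

static const struct formatter_entry *find_formatter(const char *type, size_t len)
{
    for (size_t i = 0; i < sizeof formatters / sizeof formatters[0]; i++)
        if (strlen(formatters[i].type) == len &&
            memcmp(formatters[i].type, type, len) == 0)
            return &formatters[i];
    return NULL;
}

/* Parameters are passed by address. Assumes all object pointers share one
   representation, as on mainstream ABIs. Returns 0 on success, -1 on error. */
int format_string(struct outbuf *out, const char *fmt, ...)
{
    const void *args[MAX_ARGS];
    size_t fetched = 0, next = 1;
    int status = 0;
    va_list ap;
    va_start(ap, fmt);
    while (*fmt && status == 0) {
        if (*fmt != '{') { put(out, *fmt++); continue; }
        const char *p = fmt + 1, *options = "";
        size_t number = 0;
        while (*p >= '0' && *p <= '9' && number <= MAX_ARGS)
            number = 10 * number + (size_t)(*p++ - '0');
        if (p > fmt + 1 && *p == ':') p++;      /* explicit "number:" prefix */
        else { p = fmt + 1; number = next; }    /* none: use the next parameter */
        const char *type = p;
        while (*p && *p != '/' && *p != '}') p++;
        const struct formatter_entry *e = find_formatter(type, (size_t)(p - type));
        if (*p == '/') { options = ++p; while (*p && *p != '}') p++; }
        if (*p != '}' || !e || number < 1 || number > MAX_ARGS) { status = -1; break; }
        while (fetched < number) args[fetched++] = va_arg(ap, const void *);
        next = number + 1;
        fmt = p + 1;
        status = e->func(out, args[number - 1], options, e->context);
    }
    va_end(ap);
    if (out->size)
        out->data[out->len < out->size ? out->len : out->size - 1] = '\0';
    return status;
}
```

Switching current_locale.date_pattern to something like "{3:i}.{2:i}.{1:i}" changes the date order without touching any caller, which is also where the numbered parameters earn their keep. A real implementation would probably want a recursion depth limit in format_string() to catch the silly cases.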
This is just an example, but hopefully gives an idea of how one could replace printf with something much better, much easier to control, and based on user-extensible formatter functions. And all designed to support embedded development, especially when tight on RAM.