Author Topic: [Solved] Saving "printf" arguments for later? Real-time on slow processor.  (Read 9732 times)


Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #50 on: January 19, 2025, 12:13:23 am »
The OP hasn't said why that library must be used, nor what the library presumed about the environment in which it is executing.

If the library source is available, tweak that.

If not, talk to the creator and pay them to do a better job.

If not, use a more powerful processor or processors. If the processor is only just fast enough as it is, what happens when a small enhancement is requested? Or the cache misses are pessimal? Or an extra interrupt occurs?

Disassembling the source or what is on the stack seems a very brittle approach. Even if it appears to work now, changes to the library source or compiler or execution environment could subtly break things.

This is running on some proprietary application specific hardware, and the third party software is very "special" (over 100kLOC, formally verified to be "correct", very big, very complicated, written by someone else, and not at all economically maintainable by us) which makes it effectively immutable. Imagine reading system calls from ROM. "Just rewrite it", "Just use a different processor", etc. is not likely to happen on this particular device. The complexity and cost would be astronomical compared to just writing 20 lines of assembly language.

I have a suspicion that the ARM cortex M0 ABI documents might codify how these variadic calling conventions work. I need to check.

If we have to lock down the toolchain version "forever" (use GCC version x.y.z) that is fine.

We are using memory protection which lets us be fairly comfortable with shenanigans, if the printf process goes wrong, the system will fail gracefully.
« Last Edit: January 19, 2025, 12:42:00 am by incf »
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21682
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #51 on: January 19, 2025, 12:52:54 am »
That's the first time I have heard something '...formally verified to be "correct"...' since the 80s. Since you mention "100kLOC" and "complicated", I presume the weasel word is "correct"; maybe it is a meaning unfamiliar to me.

What's the cost of it missing deadlines intermittently, because your testing didn't uncover the absolute worst case timing? More or less than 20 lines of code?

Probably unmaintainable by you, which leaves paying the library author to tweak their code.

Good luck.
« Last Edit: January 19, 2025, 12:57:34 am by tggzzz »
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7441
  • Country: fi
    • My home page and email address
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #52 on: January 19, 2025, 01:01:22 am »
I'm looking at the assembly and I think gcc stores the size of the variable arguments list using SP and R7.
No; SP is the stack pointer, and that version of GCC for that particular code just used R7 to store the original value.

For 32-bit ARM, see Procedure Call Standard for the Arm Architecture (32 bit) aka aapcs32; you can download the PDF under Assets from github.com/ARM-software/abi-aa/releases.

Additionally, because the stuff to be printed are variadic arguments, you also need to check the appropriate C standard (or its final free preprint) for the implicit conversions done to variadic arguments.  Essentially, all signed integer types smaller than int are passed as an int, all unsigned integer types smaller than unsigned int are passed as an unsigned int, and float values are passed as a double.
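A minimal sketch of those promotions as seen from the callee's side, assuming any C99 compiler. The name `sum_promoted` is made up for illustration and is not part of the library or of printf.

```c
#include <stdarg.h>

/* Hypothetical helper: the caller passes one char, one short and one float
 * as variadic arguments. Because of the default argument promotions they
 * must be fetched as int, int and double. */
static double sum_promoted(int marker, ...)
{
    va_list ap;
    va_start(ap, marker);
    int    c = va_arg(ap, int);     /* char  arrives as int    */
    int    s = va_arg(ap, int);     /* short arrives as int    */
    double f = va_arg(ap, double);  /* float arrives as double */
    va_end(ap);
    return (double)marker + c + s + f;
}
```

For a capture routine this means only a handful of storage sizes have to be handled: int-sized, long and long long, double, and pointers.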

As an example, consider godbolt.org/z/5eMar68G9, to see the Thumb code generated for ARMv7e-m using AAPCS32 (unknown-abi) for various variadic function calls using GCC 5.4.1 with -O2; or godbolt.org/z/MT934Eer6 for same on ARMv6-m for Cortex-M0.  No parameter count in sight.  The format string is always in R0, and 32-bit parameters are packed in order to R1, R2, R3, and stack; with 64-bit parameters packed to "even" registers and then to stack.

Again, at the source code level this is much simpler.  Not all "third party libraries" are closed-source, so without seeing "closed source" or "source not available" –– "control" can mean anything ––, I will not assume that the sources are not available.
« Last Edit: January 19, 2025, 01:04:32 am by Nominal Animal »
 
The following users thanked this post: SiliconWizard

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4072
  • Country: us
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #53 on: January 19, 2025, 01:06:15 am »
Disassembling the source or what is on the stack seems a very brittle approach. Even if it appears to work now, changes to the library source or compiler or execution environment could subtly break things.

Yes. As I mentioned, printf is a whole family of functions and the library may use a whole bunch of different variants for its printf-formatting needs. Catching all of them looks, as I said, like a dead-end to me, but hey. Good luck.

He doesn't need to actually intercept printf.  The library allows him to register a callback which is used for logging.  The callback is required by the library  to have the same signature as printf.

The only question is how to record the va_list parameters for later processing.  The only reliable, portable way is to use the va_* macros which requires at least partial parsing of the format string to identify the number and type/size of arguments.  I think this is the best way to solve this problem. Scanning the format string should be quite fast.  You don't need to care about anything except the data size.  I strongly suggest implementing this first and seeing if it meets the performance goals. 

If that's still not fast enough, then finding a way to memcpy the entire varargs list would be faster.  There is no portable way to do that, and generally not even any reliable way to do it in a non portable way. 
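The format-scanning approach could look roughly like the sketch below. It is not the code from reply #45; every name in it (`capture_args`, `log_record`, `log_callback`, `MAX_SAVED_ARGS`) is hypothetical, and it deliberately rejects anything beyond a small set of conversions.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; nothing below comes from the library. */
enum arg_kind { ARG_INT, ARG_LONG, ARG_LLONG, ARG_SIZE, ARG_DOUBLE, ARG_PTR };

struct saved_arg {
    enum arg_kind kind;
    union { int i; long l; long long ll; size_t z; double d; const void *p; } v;
};

#define MAX_SAVED_ARGS 8

struct log_record {
    const char *fmt;                        /* formatted later, off the hot path */
    int nargs;
    struct saved_arg args[MAX_SAVED_ARGS];
};

/* Walks fmt and pulls one argument per conversion out of ap.
 * Returns the number of saved arguments, or -1 for anything unsupported
 * ('*' width/precision, positional arguments, %n, long double, ...). */
static int capture_args(struct log_record *rec, const char *fmt, va_list ap)
{
    rec->fmt = fmt;
    rec->nargs = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;                       /* "%%" consumes no argument */
        while (*p != '\0' && strchr("-+ #0123456789.", *p) != NULL)
            p++;                            /* flags, width, precision */
        int longs = 0, is_size = 0;
        for (;; p++) {                      /* length modifiers */
            if (*p == 'l')      longs++;
            else if (*p == 'z') is_size = 1;
            else if (*p != 'h') break;      /* h/hh: promoted to int anyway */
        }
        if (rec->nargs == MAX_SAVED_ARGS)
            return -1;
        struct saved_arg *a = &rec->args[rec->nargs];
        switch (*p) {
        case 'd': case 'i': case 'u': case 'x': case 'X': case 'o': case 'c':
            /* Signed and unsigned values are fetched through one type of
             * the same size; that is an assumption about the ABI. */
            if (is_size)         { a->kind = ARG_SIZE;  a->v.z  = va_arg(ap, size_t); }
            else if (longs >= 2) { a->kind = ARG_LLONG; a->v.ll = va_arg(ap, long long); }
            else if (longs == 1) { a->kind = ARG_LONG;  a->v.l  = va_arg(ap, long); }
            else                 { a->kind = ARG_INT;   a->v.i  = va_arg(ap, int); }
            break;
        case 'f': case 'F': case 'e': case 'E': case 'g': case 'G':
            a->kind = ARG_DOUBLE; a->v.d = va_arg(ap, double);
            break;
        case 's': case 'p':
            a->kind = ARG_PTR; a->v.p = va_arg(ap, const void *);
            break;
        default:
            return -1;                      /* includes a truncated format */
        }
        rec->nargs++;
    }
    return rec->nargs;
}

static struct log_record last_record;

/* Callback with the same signature as printf, as the library requires. */
static int log_callback(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = capture_args(&last_record, fmt, ap);
    va_end(ap);
    return n;
}
```

Note that for %s only the pointer is saved. If the library passes strings that live in a temporary buffer, the characters would have to be copied at capture time as well.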
 
The following users thanked this post: Siwastaja

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #54 on: January 19, 2025, 01:09:17 am »
That's the first time I have heard something 'formally verified to be "correct" ' since the 80s. I presume the weasel word is "correct"; maybe a meaning unfamiliar to me.

What's the cost of it missing deadlines intermittently, because your testing didn't uncover the absolute worst case timing? More or less than 20 lines of code?

Good luck.
We are not getting new silicon to work with, the system and the software is what it is, even if it is painfully slow. (writing this mostly to stave off future waves of duplicate suggestions)

I appreciate the concern.

I'd appreciate if future discussion was limited to the proposed assembly language solution dealing with C variadic arguments on GCC and ARM Cortex-M0. I'm fairly confident that all the other avenues have been suggested, thoroughly discussed, and weighed against all the other possibilities. Assembly language "won" as the fastest, easiest, and "best bang for buck" way of solving the problem. Even if it is a bit unpleasant. Edit: they were right

(sidebar conversation will continue no doubt, and many will try to convince me of the error of my ways, but I tried - some methods are slower than others. We need speed at any cost!)
« Last Edit: January 19, 2025, 04:49:35 am by incf »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7441
  • Country: fi
    • My home page and email address
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #55 on: January 19, 2025, 01:10:33 am »
The only reliable, portable way is to use the va_* macros which requires at least partial parsing of the format string to identify the number and type/size of arguments.
My example code in reply #45 does this parsing, for all types supported by basic printf() implementations, except for positional formatting, variadic number of digits (*, .*, *.* format specifiers), and %n.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16100
  • Country: fr
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #56 on: January 19, 2025, 01:18:23 am »
Disassembling the source or what is on the stack seems a very brittle approach. Even if it appears to work now, changes to the library source or compiler or execution environment could subtly break things.

Yes. As I mentioned, printf is a whole family of functions and the library may use a whole bunch of different variants for its printf-formatting needs. Catching all of them looks, as I said, like a dead-end to me, but hey. Good luck.

He doesn't need to actually intercept printf.  The library allows him to register a callback which is used for logging.  The callback is required by the library  to have the same signature as printf.

The only question is how to record the va_list parameters for later processing.  The only reliable, portable way is to use the va_* macros which requires at least partial parsing of the format string to identify the number and type/size of arguments.  I think this is the best way to solve this problem. Scanning the format string should be quite fast.  You don't need to care about anything except the data size.  I strongly suggest implementing this first and seeing if it meets the performance goals. 

Ok, then yes. Just use the standard variadic argument access. Those macros are pretty lightweight, but yes, just like printf and any variadic function, that requires knowing the number and types of arguments in advance.
I agree that it should all be much faster than what the actual standard printf does, but you still have to scan the format to determine the arguments and their types, and then access these arguments (using the va_ macros as you said) and store them somewhere. A mockup callback just logging the format strings themselves will show all the types of arguments they actually use and you can then restrict your final function only looking for a restricted set of types.

Certainly I second using the standard variadic macros and I'm not sure how more efficient you could get using assembly directly. The only "expensive" part will be indeed to scan the format for determining each type, but if you support, say, only a few of them (maybe int, unsigned int, float and string), this should be pretty fast. You can just look for unescaped % in the format string and for each %, look for the character for one of the types you support (which may not be right after the % char), and that should only be a few. I don't see how else it could be done as you can't analyze format strings at compile time in C, but even if you could (in C 2050?), here you don't even have access to the format string literals at compile time, as all you have at your disposal is a parameter passed to a callback function.
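The mockup callback idea can be sketched as a survey pass that never touches the arguments and only tallies which conversion characters the library really uses. `survey_format`, `survey_callback` and `conv_seen` are made-up names, and the set of flag and length characters skipped is an assumption about what the library's format strings contain.

```c
#include <string.h>

/* Hypothetical tally: conv_seen['d'] counts every %d seen, and so on. */
static unsigned conv_seen[128];

static void survey_format(const char *fmt)
{
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;                       /* literal percent sign */
        /* Skip flags, width, precision and length modifiers. */
        p += strspn(p, "-+ #0123456789.*hlLjzt");
        if (*p == '\0')
            break;                          /* format ended mid-conversion */
        conv_seen[(unsigned char)*p & 127]++;
    }
}

/* Registered in place of the real logger during a survey run only. */
static int survey_callback(const char *fmt, ...)
{
    survey_format(fmt);
    return 0;
}
```

After a representative run, the non-zero entries of `conv_seen` list the conversions the final capture function has to support.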
« Last Edit: January 19, 2025, 01:20:49 am by SiliconWizard »
 

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #57 on: January 19, 2025, 01:20:23 am »
I'm looking at the assembly and I think gcc stores the size of the variable arguments list using SP and R7.
No; SP is the stack pointer,
...
Yes, it's the stack pointer. And I think it communicates the size of the variable size argument list in the sample assembly I posted. Correct? (In the specific example that I posted)

I'm fairly sure foo() and call_foo() use the area between the stack pointer and R7 as the variable parameter area.

call_foo() in particular makes it clear what is stored where in the stack.

I feel like I need to learn how to setup a QEMU emulator for ARM Cortex-M0+.

« Last Edit: January 19, 2025, 01:30:51 am by incf »
 

Offline ledtester

  • Super Contributor
  • ***
  • Posts: 3430
  • Country: us
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #58 on: January 19, 2025, 01:34:49 am »
Pretty much the exact same question on Stackoverflow:

https://stackoverflow.com/questions/1562992/best-way-to-store-a-va-list-for-later-use-in-c-c

The last response suggests there is a way to determine the number of arguments passed, but it might be runtime/architecture dependent.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7441
  • Country: fi
    • My home page and email address
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #59 on: January 19, 2025, 01:36:56 am »
I'm fairly sure foo() and call_foo() use the area between the stack pointer and R7 as the variable parameter area.
If you look at godbolt.org/z/MT934Eer6, you'll see it does not.

R7 is a register that functions must save (i.e., when they return it must have the same value it had when the function was called), but it has no other special meaning in aapcs32.  In your case, you only see it used as such because of the compiler options you are using.  If you use -Os or -O2 (as is common) and -march=armv6-m -mcpu=cortex-m0 -mthumb (as is usual for Cortex-M0 code), you'll see R7 is no longer used as such.

Of course, you could disassemble the closed-source library object code using printf() and verify, but I believe it would make more sense to do that to find out the string pointed to by R0 at each bl printf, and make sure you parse those correctly.
 

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #60 on: January 19, 2025, 01:46:28 am »
I'm fairly sure foo() and call_foo() use the area between the stack pointer and R7 as the variable parameter area.
If you look at godbolt.org/z/MT934Eer6, you'll see it does not.

R7 is a register that functions must save (i.e., when they return it must have the same value it had when the function was called), but it has no other special meaning in aapcs32.  In your case, you only see it used as such because of the compiler options you are using.  If you use -Os or -O2 (as is common) and -march=armv6-m -mcpu=cortex-m0 -mthumb (as is usual for Cortex-M0 code), you'll see R7 is no longer used as such.

Of course, you could disassemble the closed-source library object code using printf() and verify, but I believe it would make more sense to do that to find out the string pointed to by R0 at each bl printf, and make sure you parse those correctly.

I initially only saw your first example for a different ISA than I was using.

If I understand correctly, the return SP address is placed on the stack when there are more than 3 arguments (and my sample code just happened to use R7 at both ends to get the desired return SP value into and out of the stack)
And I suppose SP is not advanced prior to calling if there are exactly 4 arguments.

I think the example you posted follows the same stack format as my gcc 14.2 example? And it shows the stack pointer, and the stack pointer return address stored as part of the call at least on the longer variants. I suppose when there are fewer than 3 arguments, it does not have to do anything to restore the stack pointer.



I'm fairly certain that gcc has to maintain ABI compatibility over long periods of time so that people can link against precompiled libc libraries and successfully call printf regardless of optimization level.

edit: some of my personal description of the register calling conventions might be wrong. I need to spend a day or two playing around to really see how it works.

The different optimization levels have to be doing the same thing. If it explicitly stores the stack return address with low optimization, it must be doing the same at higher optimizations, even if the compiler achieves it in a less obvious/direct way.

edit2: at higher optimization levels gcc appears to move responsibility for setting up the stack pointer/return address "stuff" from the callee onto the caller. It also often decides to "stomp" on the current stack and then fix it after the call returns.
« Last Edit: January 19, 2025, 02:31:19 am by incf »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4072
  • Country: us
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #61 on: January 19, 2025, 02:53:07 am »
Caller vs callee saved registers are defined by the platform ABI.  It can be found in the ABI documentation and doesn't depend on optimization level.  Low optimization levels may not use all registers or may redundantly save and restore registers that aren't needed, so it may not be clear from the generated code what is incidental compiler behavior and what the ABI actually requires.  On some ARM platforms the caller saved registers are effectively enforced by the architecture as they are automatically saved on interrupt.

On systems with downward growing stacks, including ARM, on entry to a function, the stack pointer points to the beginning (lowest address) of the parameter passing area.  Since most calling conventions push right-to-left that is "near" the first argument passed (or the first non-register argument).  From there it's an "easy" fixed offset to find the beginning of the variadic arguments.  Finding the end is much harder.

When using a frame pointer, the frame pointer generally points to the bottom of the previous function's stack frame.  In the simplest case this will also be the top of the parameter passing region.  However it's not that simple.  Frame pointers are optional and subject to optimization, and even if used there may be other stack data dynamically added below it aside from the function parameters.  For instance parameters for other function calls, alloca() allocations, and other temporary variables spilled from the register file.

 

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #62 on: January 19, 2025, 03:06:42 am »
I feel as if the existence of va_copy() would require that gcc provide enough information to make a copy of the variadic arguments.

I also see that each call to a printf always seems to allocate two extra words of "unused" stack. I need to confirm, but I believe those two words are populated later in the compilation process (at link time?) with the real stack bounds for use with gcc's builtin va_* functions.

I sort of wonder if any standards codify it. I also wonder about interoperability of va_copy with other compilers. I feel like my best bet is to understand how gcc's builtin works, and do whatever it does.

edit: fn_abi_va_list and arm_build_builtin_va_list appear to be relevant
« Last Edit: January 19, 2025, 04:24:45 am by incf »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4072
  • Country: us
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #63 on: January 19, 2025, 03:41:13 am »
I feel as if the existence of va_copy() would require that gcc provide enough information to make a copy of the variadic arguments.

va_copy just makes a backup copy of the pointer to the first argument, allowing you to iterate over the arguments twice.  It doesn't copy or even access the arguments themselves at all.
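A small sketch of that behavior, assuming a C99 compiler; `sum_twice` is a made-up name. The copy lets the same arguments be walked a second time, but the count still has to come from the caller.

```c
#include <stdarg.h>

/* Hypothetical helper: walks the same `count` int arguments twice. */
static int sum_twice(int count, ...)
{
    va_list ap, again;
    int total = 0;

    va_start(ap, count);
    va_copy(again, ap);             /* duplicates the cursor, not the data */

    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    for (int i = 0; i < count; i++)
        total += va_arg(again, int);

    va_end(again);
    va_end(ap);
    return total;
}
```

Nothing in `again` records where the argument list ends, so va_copy cannot be used to snapshot an argument list of unknown length.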
 

Offline Analog Kid

  • Super Contributor
  • ***
  • Posts: 1558
  • Country: us
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #64 on: January 19, 2025, 03:55:14 am »
For example, if it might be possible to copy a chunk of the printf stack? Having to parse the format string (to get the number and type of arguments) and copy the variadic arguments one by one is a step that I would not mind pushing off to a background thread.

You shouldn't have to parse anything. It's not as if you're going to be analyzing the text used to make the printf() call:
Code:
    printf (&formatString, arg1, arg2, arg3);

Since the arguments are pushed on the stack, you only need to look at the stack to retrieve them. Nope; see below. And the format string that tells printf() how to interpret the arguments is just another argument (a pointer to a string), one that you'll be passing to the "real" printf() once you have the time to do so, so no need to parse that either.

Or you could write your own printf() entirely. I did that years back, in assembly language (8086, for PC DOS), and it wasn't at all difficult. I wrote a stripped-down version, since I only needed to handle the following formats:

o %d
o %u
o %x
o %s


and of course various literals (\n, \t, \c, etc.)

If you can't do it in C you could use the microprocessor's assembly language.

Ack; it's been quite some time since I even used my printf(), so I forgot one basic thing that makes it a lot easier:
You don't have to look at the stack to determine how many arguments there are. You only need to look at the format string to see how many format specifiers (%d, %s, etc.) there are. There will be an argument for each specifier. (Assuming, of course, that the format string and argument list are properly defined. If they're not, woe be to the programmer anyhow.)

So you do need to parse the format string, looking for percent signs, but that turns out to be fairly trivial.
I could post the (assembly language) code if you like.
« Last Edit: January 19, 2025, 04:21:11 am by Analog Kid »
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 878
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #65 on: January 19, 2025, 04:16:27 am »
Quote
I already added buffering
Buffer or no buffer, the uart hardware has the ultimate say in the matter. First step is get the uart running as fast as possible. I run any uart debug output at 1MBaud, and any modern mcu like an M0+ will have no problem doing so. Depending on the combination of mcu speed and printf implementation, you will ultimately either be limited by the printf implementation or the uart hardware. If the printf code becomes the limiting factor then a buffer is of little use, if the uart hardware is the limiting factor then a buffer could mean you simply wait 'later' but still wait.

If you have a Segger debugger, external or built-in to a dev board, there is also Segger RTT which can move data quickly. Easy enough to redirect existing printf output to Segger RTT. They have blocking (wait) and non-blocking (drop) options.

I think I would first figure out how much debug data is being generated. Create code that simply replaces the printf destination and only counts bytes. With that info you will at least be able to figure out if it is even doable with whatever baud rate you are using. It could be the uart is able to output the data at that rate but maybe the uart hardware buffer is shallow and you end up in a tx buffer empty interrupt quite often. If you have DMA, then that would be an option to eliminate a high occurrence of uart interrupts.
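One way to do the byte counting, sketched under the assumption that the library's logging callback has the printf signature mentioned earlier in the thread and that the C library provides a C99 vsnprintf. `counting_callback` and `debug_bytes` are made-up names.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical running total of bytes the library would have printed. */
static unsigned long debug_bytes;

/* Measurement-only callback: produces no output, just adds up lengths. */
static int counting_callback(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    /* With a NULL buffer and size 0, vsnprintf stores nothing and
     * returns the length the formatted text would have had. */
    int n = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (n > 0)
        debug_bytes += (unsigned long)n;
    return n;
}
```

This still pays the full formatting cost, so it only makes sense in a measurement build. Dividing `debug_bytes` by the elapsed time gives the sustained rate the output path would have to carry.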
« Last Edit: January 19, 2025, 07:40:30 am by cv007 »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7441
  • Country: fi
    • My home page and email address
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #66 on: January 19, 2025, 04:36:31 am »
If I understand correctly, the return SP address is placed on the stack
No.  The aapcs32 ABI for Cortex-m0 says that return address is in LR, R0-R3 contain parameters, and any additional parameters are on stack.  It is the caller that needs to reset the stack pointer after a call with additional parameters passed on stack, because the callee, the function being called, must preserve R4-R11 and SP (=R13).  There is no way to find out in the callee how many parameters were actually supplied, unless the caller tells it in an always-passed parameter (like the format string).

In your screenshots, the compiler options are such that it just happened to use R7 to store the old value of SP, so that [SP, R7) contained additional parameters.  The compiler is absolutely free to save the old value of SP whatever way it wishes.  Indeed, in the linked example, the compiler often simply adds to SP the amount of stack space used (less LR), then pops the saved LR value directly to PC.  (This is because bx lr is equivalent to push {lr}, pop {pc}.)

Put simply, when printf() gets called using aapcs32 ABI on Cortex-M0, all it knows is that R0 contains the address to the format string, and R1-R3 and possibly stack contain the parameters referred to.  Because the parameters are all of basic types (signed and unsigned char, short, int, long, long long, float, double) or types mapping to basic types (size_t, possibly ssize_t, intmax_t, uintmax_t, and ptrdiff_t), in this ABI they occupy either one or two 32-bit words.  The format string is the only one that can tell you how many and what the supplied parameters are.

Essentially, there is no way in aapcs32 ABI on Cortex-M0 (or in most other ABIs) for the func() function implementation to be able to differentiate between
    func(5);
    func(5, 4);
    func(5, 4, 3);
    func(5, 4, 3, 2);
    func(5, 4, 3, 2, 1);
    func(5, 4, 3, 2, 1, 0);
calls.  See godbolt.org/z/efTc8sKb7 for proof.

I feel as if the existence of va_copy() would require that gcc provide enough information to make a copy of the variadic arguments.
No.

In essence, va_list could simply be a signed integer, with negative values describing registers, first parameter corresponding to the most negative value, and zero and positive values to stack offsets.  Then, va_start() initializes it to the value corresponding to the first variadic parameter, va_end() does nothing, va_arg() returns the one or two-word value at the register or stack offset and advances it accordingly, and va_copy() copies the current signed integer to another va_list variable.
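That description can be turned into a toy model. This is only an illustration of the idea with 32-bit arguments, not how GCC or aapcs32 actually implement va_list, and all the `toy_` names are invented.

```c
#include <stdint.h>

/* Toy stand-in for the state a callee sees on entry. */
struct toy_frame {
    uint32_t regs[4];          /* stand-ins for r0..r3 */
    const uint32_t *stack;     /* words the caller placed on the stack */
};

/* -4..-1 select r0..r3, 0 and up select stack words. */
typedef int toy_va_list;

/* fixed_words: how many leading words the named parameters occupy. */
static void toy_va_start(toy_va_list *ap, int fixed_words)
{
    *ap = -4 + fixed_words;
}

/* Fetches the next 32-bit argument and advances the cursor. */
static uint32_t toy_va_arg32(const struct toy_frame *f, toy_va_list *ap)
{
    int i = (*ap)++;
    return i < 0 ? f->regs[i + 4] : f->stack[i];
}

/* Copies the cursor only; the argument words are never touched. */
static void toy_va_copy(toy_va_list *dst, const toy_va_list *src)
{
    *dst = *src;
}
```

Nothing in the model knows how many stack words are valid, which is the same limitation the real implementation has.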

Most architectures do it somewhat like this, except that certain types of values may be stored in a separate register file.  For example, on SYSV ABI on x86-64, xmm0 to xmm7 registers are used to store the first eight double parameters.  Thus, va_list may be a pair of indices, or even a bitmap.  As the va_list type variable is passed as the first argument to the va_ functions, and C passes simple types by value, the exact implementation (in the C library stdarg.h) can be quite funky; for GCC, these are implemented as compiler built-ins (__builtin_va_start(list,param), __builtin_va_arg(list,type), __builtin_va_copy(listcopy,list), and __builtin_va_end(list)).

For example, GCC 9.2.1 for aapcs32 and Cortex-M0 tends to push {r0, r1, r2, r3} at the beginning of a variadic function implementation, so that all the parameters are actually on the stack in order: r0 at SP, then r1, r2, r3, followed by any parameters pushed to the stack by the caller.  You can see this clearly in godbolt.org/z/efTc8sKb7.  Note that r4 is sometimes pushed to the stack even when it is not used, only to keep the stack double-word aligned; the number of registers pushed is always even.  (Clang tends to use r7, so it is not necessarily r4; it could be r4-r8 or r10 just as well.)  Also note that GCC generates non-optimal code here; it does not need to preserve r0-r3, but sometimes does, using an unnecessary amount of stack.  Function bodies in aapcs32 must preserve r4-r11 and SP (r9 might be special).
In any case, the func() implementation will not receive any information it could use to determine how many parameters were supplied, as you can see.  That must be provided by the fixed parameter(s), which in the case of printf() is the format string.

You can see the aapcs32 core register use here.
« Last Edit: January 19, 2025, 04:39:39 am by Nominal Animal »
 

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #67 on: January 19, 2025, 05:08:34 am »
You all are right.

I genuinely have no choice but to spend hundreds of cycles scanning the format string so that the stack pointer can get set back to the correct position when the function returns (at typical optimization levels)

I am honestly a little bit relieved, and a bit horrified at the same time. The variadic ABI is not what I expected (at high optimization levels). While now I know exactly what hypothetical* things I would have to do to make stack copying possible, the changes would be far more invasive than cobbling together several layers of standards compliant C.

*-finstrument-functions and -fprofile-filter-files=printf_wrapper.c
« Last Edit: January 19, 2025, 11:51:48 am by incf »
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 303
  • Country: ua
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #68 on: January 19, 2025, 09:22:16 am »
What's the main source of slowness though? Is it formatting, or output?
Some answers address the former, some answers address the latter (e.g. _write override). OP, do you have numbers for both?
Open source embedded network library https://github.com/cesanta/mongoose
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16100
  • Country: fr
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #69 on: January 19, 2025, 09:23:43 am »
But as I said, I think you can implement a pretty efficient scanning of the format strings as all you need is to identify the type of arguments. You could make the scanning much faster if doing it by 32-bit chunks instead of byte by byte, but be aware that the Cortex M0 doesn't support unaligned access, so if the format string isn't 4-byte aligned, you'd have to scan up to 3 bytes first and then 32-bit per 32-bit. I think that's probably worth a shot.
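A sketch of the word-at-a-time scan, with the byte-wise prologue for unaligned starts. `find_percent_or_end` is a made-up name. The code assumes that reading up to three bytes past the terminator inside the last aligned word is harmless on the target and that accessing the string through a `uint32_t` pointer is acceptable to the compiler; both are common practice in hand-tuned string routines but fall outside strict ISO C.

```c
#include <stdint.h>

/* Returns a pointer to the first '%' in s, or to the terminating NUL. */
static const char *find_percent_or_end(const char *s)
{
    /* Byte by byte until s is 4-byte aligned (at most 3 steps). */
    while (((uintptr_t)s & 3u) != 0) {
        if (*s == '%' || *s == '\0')
            return s;
        s++;
    }

    /* One aligned 32-bit load per 4 characters. */
    const uint32_t *w = (const uint32_t *)(const void *)s;
    for (;;) {
        uint32_t v = *w;
        uint32_t x = v ^ 0x25252525u;        /* '%' bytes become zero */
        /* Classic zero-byte test: nonzero if any byte of the word is 0. */
        uint32_t nul_hit = (v - 0x01010101u) & ~v & 0x80808080u;
        uint32_t pct_hit = (x - 0x01010101u) & ~x & 0x80808080u;
        if ((nul_hit | pct_hit) != 0)
            break;
        w++;
    }

    /* The hit is somewhere in this word; locate it byte by byte. */
    s = (const char *)w;
    while (*s != '%' && *s != '\0')
        s++;
    return s;
}
```

Whether this beats a plain byte loop on a Cortex-M0 depends on how long the literal runs between conversions are, so it is worth measuring on the real format strings before committing to it.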
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7538
  • Country: pl
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #70 on: January 19, 2025, 09:28:22 am »
Certainly I second using the standard variadic macros and I'm not sure how more efficient you could get using assembly directly. The only "expensive" part will be indeed to scan the format for determining each type, but if you support, say, only a few of them (maybe int, unsigned int, float and string), this should be pretty fast. You can just look for unescaped % in the format string and for each %, look for the character for one of the types you support (which may not be right after the % char), and that should only be a few. I don't see how else it could be done as you can't analyze format strings at compile time in C, but even if you could (in C 2050?), here you don't even have access to the format string literals at compile time, as all you have at your disposal is a parameter passed to a callback function.

This can be done ahead of time, if you extract format strings from the binary blob and generate a C function for each of them.
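A table-driven variant of that idea, rather than one generated function per string, might look like this. The table contents, the signature letters and every name here are invented for illustration; in a real build an offline tool would emit the table from the format strings found in the binary, keyed by their addresses.

```c
#include <stddef.h>

/* Stand-ins for format strings that live inside the library image. */
static const char fmt_state[] = "state=%d\n";
static const char fmt_error[] = "%s: err %x at %u\n";

struct fmt_entry {
    const char *fmt;   /* address of the format string */
    const char *sig;   /* pre-decoded kinds: 'i' = int-sized, 's' = pointer */
};

/* Hypothetical output of the offline extraction step. */
static const struct fmt_entry fmt_table[] = {
    { fmt_state, "i"   },
    { fmt_error, "sii" },
};

/* Looks up a format by address; no scanning of the string at run time. */
static const char *lookup_signature(const char *fmt)
{
    for (size_t i = 0; i < sizeof fmt_table / sizeof fmt_table[0]; i++)
        if (fmt_table[i].fmt == fmt)
            return fmt_table[i].sig;
    return NULL;       /* unknown format: fall back to scanning it */
}
```

With many format strings the linear search would be replaced by a sorted table and a binary search, or by a hash of the address.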
 

Offline incfTopic starter

  • Regular Contributor
  • *
  • Posts: 154
  • Country: us
  • ASCII > UTF8
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #71 on: January 19, 2025, 12:49:52 pm »
If I had the freedom to change the third party library, now that I've learned what I currently know (the variadic ABI does not pass a return stack address, nor any size or argument count metadata, etc. etc.) the optimal way to solve this would have been:
  • write a program to scan for each type of printf-like function call (there are a handful of different flavors) and swap it with a fixed-signature function call. For example, my_printf("foo %d bar %x baz %s", 1, 2, "qux") would become my_printf_dxs("foo %d bar %x baz %s", 1, 2, "qux")
  • (optional) write a program to do the reverse, so one can verify that the modified software is identical to what you started with (you don't know it worked properly every time on the >100kLOC library unless you can diff against the original source, and each new release of the library has to be modified in exactly the same way to avoid manually merging/maintaining and inevitably making mistakes)
  • (I assume it could be made to cope with some calls being several layers of macros while others are straight function calls, e.g. by storing the original line in a comment; there are lots of multi-line statements)
  • manually implement a bunch of wrappers for my_printf_xxx that store the argument lists in buffers, to be processed later

Now, I'm unlikely to do that since it seems too large to do correctly. A string scanner will likely be adequate.
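
For reference, a rough sketch of what one such fixed-signature wrapper could look like. Only the name my_printf_dxs comes from the example above; the record layout, queue size, and flush_one_dxs are assumptions for illustration. A real version would need one record type (or a tag) per signature, and the %s pointer is only valid later if the string it points to is not modified or freed in the meantime.

```c
#include <stddef.h>
#include <stdio.h>

/* One queued call of the "%d %x %s" flavor. */
typedef struct {
    const char *fmt;   /* pointer only, the format text is not copied */
    int         d;
    unsigned    x;
    const char *s;     /* assumption: still valid at flush time */
} rec_dxs;

#define QUEUE_LEN 32u
static rec_dxs queue[QUEUE_LEN];
static size_t  q_head, q_tail;   /* head: next write, tail: next read */

/* Fast path: no parsing, just store the arguments. Returns -1 if full. */
int my_printf_dxs(const char *fmt, int d, unsigned x, const char *s)
{
    size_t next = (q_head + 1u) % QUEUE_LEN;
    if (next == q_tail)
        return -1;
    queue[q_head].fmt = fmt;
    queue[q_head].d = d;
    queue[q_head].x = x;
    queue[q_head].s = s;
    q_head = next;
    return 0;
}

/* Slow path, run when there is time: format the oldest record.
 * Returns snprintf's result, or -1 if the queue is empty. */
int flush_one_dxs(char *out, size_t cap)
{
    const rec_dxs *r;
    int n;

    if (q_tail == q_head)
        return -1;
    r = &queue[q_tail];
    n = snprintf(out, cap, r->fmt, r->d, r->x, r->s);
    q_tail = (q_tail + 1u) % QUEUE_LEN;
    return n;
}
```

The point of the fixed signature is that the compiler already knows the argument types, so the hot path never looks at the format string at all.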

Side note for future reference: it does occasionally emit about 10kb of text in less than 100ms at one less-critical point. Scanning that would probably cost at least ~50000 extra cycles (5 clocks per string byte to copy and go through the jump table), which is about 5 ms at ~10MHz. Not great.
« Last Edit: January 19, 2025, 01:10:32 pm by incf »
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 878
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #72 on: January 19, 2025, 01:35:05 pm »
Quote
occasionally emit about 10kb of text in less than 100ms at one less-critical point
That either means 1Mbaud or 100kbaud depending on what your b in 10kb means.

What is your UART speed? What is your MCU speed, and what speed is your MCU capable of? How is your UART moving the data: DMA or interrupt? Do you have a Segger debugger in use, so that Segger RTT is available?

Speed is going to be the best solution.
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 9528
  • Country: fi
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #73 on: January 19, 2025, 01:47:40 pm »
Maybe something like this:

* Have a buffer in RAM.
* Instead of copying characters from the format string, store a pointer to the format string (because it's very likely at a statically allocated .data/.rodata address).
* Parse the string just enough to determine argument sizes (as in Nominal's example code); copy the argument data as is to the buffer.
* When you have time, process the buffer by fetching the format string and running a fully implemented printf. You can rerun your own format-parsing code to know how much space each argument occupies.
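
A sketch of how those four steps might fit together, under loud assumptions: all names (log_capture, log_render_one, log_buf) are invented here, only %d %i %c %u %x %X %s and %% are handled, length modifiers, '*' widths and floats are rejected or unsupported, every argument is stored as one pointer-sized word, and a %s argument is stored as a pointer that must still be valid at render time.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LOG_WORDS 256u
static uintptr_t log_buf[LOG_WORDS];  /* per record: fmt pointer, then args */
static size_t    log_head;            /* next free word */

/* Finds the next conversion at or after p. Stores its conversion character
 * in *conv ('\0' if none is left) and returns the position just past it. */
static const char *next_conv(const char *p, char *conv)
{
    while (*p != '\0') {
        if (*p++ != '%')
            continue;
        if (*p == '%') { p++; continue; }      /* "%%" is a literal */
        while (*p != '\0' && strchr("-+ #0123456789.", *p) != NULL)
            p++;                               /* flags, width, precision */
        if (*p == '\0')
            break;
        *conv = *p++;
        return p;
    }
    *conv = '\0';
    return p;
}

/* Fast path: store the format pointer and the raw arguments, no formatting.
 * Returns -1 (and stores nothing) on overflow or unsupported conversion. */
int log_capture(const char *fmt, ...)
{
    size_t start = log_head;
    const char *p;
    char conv;
    va_list ap;
    int ok = 1;

    if (log_head >= LOG_WORDS)
        return -1;
    log_buf[log_head++] = (uintptr_t)fmt;
    va_start(ap, fmt);
    for (p = next_conv(fmt, &conv); conv != '\0'; p = next_conv(p, &conv)) {
        if (log_head >= LOG_WORDS || strchr("dicuxXs", conv) == NULL) {
            ok = 0;
            break;
        }
        if (conv == 's')
            log_buf[log_head++] = (uintptr_t)va_arg(ap, char *);
        else if (conv == 'u' || conv == 'x' || conv == 'X')
            log_buf[log_head++] = va_arg(ap, unsigned int);
        else
            log_buf[log_head++] = (uintptr_t)(intptr_t)va_arg(ap, int);
    }
    va_end(ap);
    if (!ok) { log_head = start; return -1; }
    return 0;
}

/* Slow path: render the record at rec into out (cap must be >= 1).
 * *used receives the number of words the record occupied. */
size_t log_render_one(char *out, size_t cap, const uintptr_t *rec, size_t *used)
{
    const char *p = (const char *)rec[0];
    size_t n = 0, arg = 1;

    while (*p != '\0' && n + 1 < cap) {
        char spec[16], conv;
        const char *end;
        size_t len;
        int w;

        if (*p != '%') { out[n++] = *p++; continue; }
        if (p[1] == '%') { out[n++] = '%'; p += 2; continue; }
        end = next_conv(p, &conv);
        len = (size_t)(end - p);
        if (conv == '\0' || len >= sizeof spec)
            break;
        memcpy(spec, p, len);                  /* isolate e.g. "%04x" */
        spec[len] = '\0';
        if (conv == 's')
            w = snprintf(out + n, cap - n, spec, (const char *)rec[arg]);
        else if (conv == 'u' || conv == 'x' || conv == 'X')
            w = snprintf(out + n, cap - n, spec, (unsigned int)rec[arg]);
        else
            w = snprintf(out + n, cap - n, spec, (int)(intptr_t)rec[arg]);
        arg++;
        if (w < 0)
            break;
        n += ((size_t)w < cap - n) ? (size_t)w : cap - n - 1;
        p = end;
    }
    out[n] = '\0';
    *used = arg;
    return n;
}
```

The same next_conv walk is used on both sides, which is what keeps the capture and render passes in agreement about how many words each record holds.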
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21682
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Saving "printf" arguments for later? Real-time on slow processor.
« Reply #74 on: January 19, 2025, 02:10:16 pm »
Quote
Maybe something like this:

* Have a buffer in RAM.
* Instead of copying characters from the format string, store a pointer to the format string (because it's very likely at a statically allocated .data/.rodata address).
* Parse the string just enough to determine argument sizes (as in Nominal's example code); copy the argument data as is to the buffer.
* When you have time, process the buffer by fetching the format string and running a fully implemented printf. You can rerun your own format-parsing code to know how much space each argument occupies.

At runtime determine the address of a format string.
If it is the first time that address has been seen, parse it and store the definition somewhere in RAM.
If the address has been seen before, use the previously parsed definition.

Of course you have to hope:
  • format strings aren't dynamically created
  • you see all the format strings "early" in the execution run
  • no previously unseen format strings have to be parsed during a panic with short latency requirements
  • you don't care about the jitter introduced by parsing and lookup
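
The parse-once idea above could be sketched roughly as follows. Everything here is hypothetical: the names (fmt_signature, parse_signature, parse_count), the table size, and the encoding of a parsed definition as 2 bits per argument (1 = integer-like word, 2 = string pointer, at most 16 arguments). Length modifiers are not handled. The table is keyed purely on the address, so it inherits the first caveat in the list: a dynamically built format string reusing an address would get a stale definition.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 64u               /* must be a power of two */

typedef struct {
    const char *fmt;                  /* key: address of the format string */
    uint32_t    sig;                  /* value: 2 bits per argument */
} fmt_entry;

static fmt_entry cache[CACHE_SLOTS];
static unsigned  parse_count;         /* how often the slow path ran */

/* Slow path: walk the string once and encode the argument kinds. */
static uint32_t parse_signature(const char *p)
{
    uint32_t sig = 0;
    unsigned shift = 0;

    parse_count++;
    while (*p != '\0' && shift < 32u) {
        if (*p++ != '%')
            continue;
        if (*p == '%') { p++; continue; }           /* literal "%%" */
        while (*p != '\0' &&
               !((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z')))
            p++;                                    /* flags, width, etc. */
        if (*p == '\0')
            break;
        sig |= (uint32_t)(*p == 's' ? 2u : 1u) << shift;
        shift += 2u;
        p++;
    }
    return sig;
}

/* Returns the cached definition for fmt, parsing it on first sight. */
uint32_t fmt_signature(const char *fmt)
{
    size_t i = ((uintptr_t)fmt >> 2) & (CACHE_SLOTS - 1u);
    size_t probes;

    for (probes = 0; probes < CACHE_SLOTS; probes++) {
        fmt_entry *e = &cache[i];
        if (e->fmt == fmt)
            return e->sig;                          /* seen before */
        if (e->fmt == NULL) {                       /* first time */
            e->fmt = fmt;
            e->sig = parse_signature(fmt);
            return e->sig;
        }
        i = (i + 1u) & (CACHE_SLOTS - 1u);          /* linear probing */
    }
    return parse_signature(fmt);                    /* table full */
}
```

The lookup cost is not constant either (probing depends on how addresses collide), which is the jitter the last bullet warns about.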

Your system isn't hard realtime, and does have sufficient "excess" processing power. Doesn't it?

So, what's the application domain? I hope it is neither military nor healthcare.

My bet is the FinTech industry. They do weird things[1] with strange constraints, and aren't known for being averse to getting things wrong occasionally. I've seen, ahem, less than stellar hardware and software deployed, which matches the "defined to be correct" and "can't modify" constraints.

[1] e.g.
  • buy up microwave telecom towers/masts between Chicago and New York, because the speed of light in air is faster than in glass
  • implement business rules in hardware, i.e. FPGAs
  • lay dedicated transatlantic fibre cables, to ensure latency and bandwidth isn't affected by third parties
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

