Author Topic: Best thread-safe "printf", and why does printf need the heap for %f etc?  (Read 16004 times)


Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3694
  • Country: gb
  • Doing electronics since the 1960s...
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #50 on: August 16, 2022, 07:34:21 am »
There was also Newton-Raphson division, which used a multiply. I don't know the details (I know what N-R is; used it for square roots - the ((N/a)+a)/2 formula) but if you have a full size barrel shifter then multiplies are very fast.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4026
  • Country: nz
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #51 on: August 16, 2022, 08:26:36 am »
Quote
There was also Newton-Raphson division, which used a multiply. I don't know the details (I know what N-R is; used it for square roots - the ((N/a)+a)/2 formula) but if you have a full size barrel shifter then multiplies are very fast.

You need some initial approximation for the reciprocal. Then you use r' = r * (2 - n * r) several times to improve the reciprocal. Then you multiply by the thing you wanted to divide into.
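As a rough illustration of the iteration described above, here is a minimal C sketch. The function names `refine_reciprocal` and `divide_via_reciprocal`, and the fixed count of five iterations, are assumptions made for this example and do not come from any particular library. For positive n, the iteration only converges when the starting estimate lies strictly between 0 and 2/n.

```c
/* Refine an estimate r of 1/n using r' = r * (2 - n * r).
 * Each step roughly doubles the number of correct bits. */
static double refine_reciprocal(double n, double r, int iterations)
{
    while (iterations-- > 0)
        r = r * (2.0 - n * r);
    return r;
}

/* Compute a / n using only multiplies and subtracts, given a rough
 * starting estimate of 1/n (for example from a small lookup table). */
static double divide_via_reciprocal(double a, double n, double estimate)
{
    return a * refine_reciprocal(n, estimate, 5);
}
```

For example, `divide_via_reciprocal(6.0, 3.0, 0.3)` starts from an estimate that is 10% off and ends up very close to 2. The result is not guaranteed to be correctly rounded the way a hardware divide is.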

Here's one approach to an initial approximation that you can do in hardware.

https://observablehq.com/@drom/reciprocal-approximation

Or you can use a ROM table. Most modern ISAs have a reciprocal estimate instruction, especially in their Vector/SIMD ISAs. The RISC-V Vector ISA explicitly specifies the 128-entry x 7 bit table to be used for reciprocal estimate and reciprocal sqrt estimate instructions:

https://github.com/riscv/riscv-v-spec/blob/master/vfrec7.adoc
https://github.com/riscv/riscv-v-spec/blob/master/vfrsqrt7.adoc

"The 7 bit accuracy was chosen as it requires 0,1,2,3 Newton-Raphson iterations to converge to close to bfloat16, FP16, FP32, FP64 accuracy respectively."

Where "close" is as in "good enough for 3D graphics purposes"
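
One way to see where the 0,1,2,3 iteration counts come from (a back-of-envelope sketch, writing the current estimate as \$r = (1 - \epsilon)/n\$ with relative error \$\epsilon\$):
$$r' = r (2 - n r) = \frac{(1 - \epsilon)(1 + \epsilon)}{n} = \frac{1 - \epsilon^2}{n}$$
So each iteration squares the relative error, which roughly doubles the number of correct bits: about 7, 14, 28, 56 bits after 0, 1, 2, 3 iterations, against significand precisions of 8, 11, 24 and 53 bits for bfloat16, FP16, FP32 and FP64.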
 
The following users thanked this post: DiTBho

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: fi
    • My home page and email address
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #52 on: August 16, 2022, 09:17:37 am »
Although Bruce already explained it well, let me be me and describe how that \$x_{i+1} = x_{i} ( 2 - x_i n )\$ comes about.
I'm posting this because it is a simple and useful tool that can help in related problems, as peter-h already mentioned.

The Newton-Raphson method is a root-finding method.  We have some differentiable function \$f\$ that has a (single) root at the desired \$x\$, \$f(x) = 0\$, and want to know the exact \$x\$ for that root.  Using \$f^\prime(x)\$ for the derivative, we iterate
$$x_{i+1} = x_{i} - \frac{f(x_i)}{f^\prime(x_i)}$$

In this particular case, we pick function \$f(x) = 1/x - n\$.  Then, the derivative is \$f^\prime(x) = -1/x^2\$, and
$$x - \frac{f(x)}{f^\prime(x)} = x + x^2 \left(\frac{1}{x} - n\right) = 2 x - x^2 n = x (2 - x n)$$

Oftentimes, the true "trick" is picking a good function.  For that, I recommend computer algebra systems like Maxima/wxMaxima, SageMath, Maple, et cetera.  For example, in Maxima/wxMaxima:
    f(x) := 1/x - n $ factor( x - f(x) / diff(f(x),x) );
outputs
    -x ( n x - 2 )
which is good enough.  (We really want x(2-xn).)
The $ is a separator (like ; except that it suppresses output).  factor() is a function that tries to factor the polynomial; it tends to do better than e.g. ratsimp() in cases like this.  Then, just try the different forms for f(x) you can think of, until the expression is something you can easily calculate.
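
The same recipe recovers the square-root formula peter-h mentioned.  As a worked example (assuming we pick \$f(x) = x^2 - N\$, whose positive root is \$\sqrt{N}\$), the derivative is \$f^\prime(x) = 2 x\$, and
$$x - \frac{f(x)}{f^\prime(x)} = x - \frac{x^2 - N}{2 x} = \frac{1}{2} \left( x + \frac{N}{x} \right)$$
which is the ((N/a)+a)/2 iteration with \$a = x\$.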
« Last Edit: August 16, 2022, 09:26:12 am by Nominal Animal »
 
The following users thanked this post: DiTBho, tellurium

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26871
  • Country: nl
    • NCT Developments
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #53 on: August 16, 2022, 10:33:40 am »
Quote
probably another good reason to avoid a printf [...] which converts float to double.
Quote
Right; the requirement for C to promote variadic float arguments to double is an important reason why one should use something else instead.
Nahh, don't worry about that. In most applications formatting / printing numbers is not critical for performance.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3694
  • Country: gb
  • Doing electronics since the 1960s...
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #54 on: August 16, 2022, 01:50:34 pm »
Yes, which is why I wonder why it was decided to make the printf family use double as a default.

Double floats were not fast in those days... a PC was an 8086 running at 16MHz, and no hardware float unless you bought the extra chip.

If I need a fast "sprintf" I would do itoa() or ltoa(). If I needed a fast %f then practically always I control the format, e.g. %6.2f, and then doing a ltoa, printing a '.', and a second ltoa, is many times faster. Many years ago I was implementing HPGL parsing and generation (also Calcomp plotter language, etc, which was great fun) and those tricks were used because the moment the IAR Z180 compiler saw a %f it would go away for 10ms :)
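
A minimal sketch of the "integer, dot, integer" trick for a fixed two-decimal format, assuming the value is already held as an integer count of hundredths. The name `format_centi` is made up for this example. One detail that is easy to get wrong with two separate ltoa() calls is that the fraction must be zero-padded, otherwise 1.05 comes out as 1.5.

```c
#include <stddef.h>

/* Write centi/100 as "[-]int.ff" into buf.
 * Returns the string length, or -1 if buf is too small. */
static int format_centi(char *buf, size_t size, long centi)
{
    char tmp[24];
    size_t n = 0;
    /* Magnitude as unsigned, so that LONG_MIN is handled too. */
    unsigned long mag = (centi < 0) ? 0UL - (unsigned long)centi
                                    : (unsigned long)centi;

    /* Digits are generated least significant first, then reversed. */
    tmp[n++] = (char)('0' + mag % 10);  mag /= 10;   /* hundredths */
    tmp[n++] = (char)('0' + mag % 10);  mag /= 10;   /* tenths */
    tmp[n++] = '.';
    do {
        tmp[n++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    if (centi < 0)
        tmp[n++] = '-';

    if (n + 1 > size)
        return -1;
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return (int)n;
}
```

With this, `format_centi(buf, sizeof buf, 105)` yields "1.05" without touching floating point, the heap, or any locks.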

Super clever stuff above about N-R iteration. I was always crap at maths (I do enjoy reading "popular maths" textbooks) but I do get it. They could not chuck me out of univ because despite failing all the maths exams I was getting 100% in electronics :)
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3894
  • Country: gb
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #55 on: August 16, 2022, 04:07:36 pm »
Quote
The RISC-V Vector ISA

This is where MIPS4++ wins: RV32IMAC doesn't have one, but R18200 has a count-leading-zeros instruction. It is not directly supported by C, but when used from assembly it's extremely useful during the lookup step, when you have to choose the best initial estimate of the reciprocal using the leading 3 nonzero bits.

Modern ARM should have something similar.
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14431
  • Country: fr
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #56 on: August 16, 2022, 07:14:55 pm »
And RISC-V is a highly modular ISA as opposed to most others. That's by design. The Bitmanip extension does provide such an instruction. Of course a given core must implement this extension, which is not going to happen for a while at least on commercial cores since the extension has been finalized relatively recently...

 
The following users thanked this post: DiTBho

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4026
  • Country: nz
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #57 on: August 16, 2022, 09:02:15 pm »
Quote
The RISC-V Vector ISA

This is where MIPS4++ wins: RV32IMAC doesn't have one, but R18200 has a count-leading-zeros instruction. It is not directly supported by C, but when used from assembly it's extremely useful during the lookup step, when you have to choose the best initial estimate of the reciprocal using the leading 3 nonzero bits.

Reciprocal estimate is a floating point instruction. Floating point values are normalised, meaning the (implied) MSB is always 1, so there is no shifting needed. Denorms do exist, but the reciprocal of a denorm with two or more leading 0s is infinity anyway.

It would seem kind of crazy to do Newton-Raphson using integers. It would be tricky to avoid under/overflow.

Quote
Modern ARM should have something similar.

Modern RISC-V has CLZ. The original RV64GC from 2015 (ratified essentially unchanged in 2019, now called RVA20) doesn't, but RVA22 does, or will once it's ratified. Any new chip aimed at "real" OSes coming out from now on will have it, along with the rest of the "B" extension, in order to be RVA22 compliant. The B extension instructions are easy to implement, and uncontroversial, and were ready to go well before ratification. Unlike the much larger and more complicated Vector extension, which is optional in RVA22.

Microcontroller people are free to implement CLZ or not, depending on whether it's worth it for their application. It's a large chunk of silicon to implement a fast CLZ and probably doesn't make sense if you don't have things such as multiply and divide.
 
The following users thanked this post: DiTBho

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #58 on: August 17, 2022, 01:08:58 am »
Quote
Steele & White correctly show that to do the job properly (i.e. accurately in every case), you need to use multi-word arithmetic, and not just 2 or 3 words but numbers hundreds or thousands of bits long.
Is the amount of storage required dependent on the number of digits that you actually want to print?  I would think that printing "several fewer" digits than are accurate in the internal representation would be easier.

The paper is here, BTW.  https://lists.nongnu.org/archive/html/gcl-devel/2012-10/pdfkieTlklRzN.pdf  I guess I should read it!

The desire to "print any floating point number to a decimal string, and then convert the string back to a floating point number, and it will have the EXACT SAME bits as the original one" seems an unnecessarily high bar, for the typical embedded system displaying a number to a human user. :-(

Quote
why printfs? why don't people get rid of it once and for all?
Is there an alternative for "formatted" output?  That used to be pretty important back in the fortran and cobol days, though I guess the main excuse in modern times is just "prettyness."  C's printf() seems to be a pretty reasonable compromise for all its warts (mostly caused by it being a function, rather than part of the language, I guess.)
I actually did some work on a Pascal compiler once, to add fortran-style formatted output.  It was pretty gross (and certainly couldn't have been done in pascal itself.)  I see This conversation about Ada, which amounts to "it's quite verbose and ugly."
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: fi
    • My home page and email address
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #59 on: August 17, 2022, 02:26:07 am »
Quote
Is the amount of storage required dependent on the number of digits that you actually want to print?
Yes, if the range of values you want to print is clamped.

For example, you can create a function that prints double-precision floating point values with at most 19 decimal digits – for example ±ddddddddddddd.ffffff – it is trivial to do using a single 64-bit (unsigned) integer with exact output.

The problem is that unlike Fortran, C printf() does not clamp the range when using say %6.0f (which prints the double-precision value rounded to an integer).  It only ensures the output is at least six characters wide, so for +0.0 it prints "     0" and for -0.0 it prints "    -0"; but for e.g. 1e9 it still prints "1000000000".

If the value is very large, say 1e300, and you want it to print that value instead of a number that has the first 20 or so digits 9 or 1 followed by zeroes, and then some pseudo-random digits arising from rounding errors, you do need about a thousand bits of storage (or a bit less than that if you're clever) to do so.
We (brucehoult and I) know that it can be done in a fixed amount of space, even though "current wisdom" is somewhat different; and brucehoult has even discussed this with the authors of the "current wisdom", who agreed.  It does not matter if the output format is utterly silly, like say "%999999.9999"; the required space is determined by the magnitude and precision of the type used.

This is what some of my ramblings in this and the other related threads drive at: you could have a formatting function that splits into multiple implementations, so that if the format and range is limited to an "easier" one like the ±ddddddddddddd.ffffff above, you could use a very fast, no-extra-memory needing version even in an interrupt context, but it would just say No by returning an error if the value is not within the range and clamping is not allowed.  For the general printf()-like conversion without any range limits, the wrapper function would just step up to the next formatter function, with the big-slow (what printf() uses) being the final backup version.  If all you use are the clamped versions, you don't need the big-slow at all.

But you cannot really do this with printf() in any sane manner.  For example, you cannot tell printf() to print say ±999999.999 for numbers that cannot be printed in ±6.3 format.  It cannot tell if at runtime the big-slow is needed or not, so it would have to be included in the firmware anyway; and then the other versions become just extra bloat –– a runtime speed optimization! –– which is not in the purview of a standard/base C library at all.

Thus, a different interface is needed.  As long as one is limited to the printf() interface, you have to be able to output the entire range and full precision.  I guess you could play with some preprocessor macros that specify some global precision and range limits used, but then you need to recompile not only your embedded firmware, but also the base C/C++ library it uses, every time you change those... Urgh, definitely a land of bugs and odd behaviour awaits there.
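
To make the "fast, range-limited formatter that just says No" idea concrete, here is a minimal sketch for the ±ddddddddddddd.ffffff case. The name `format_fixed6` and its return convention are assumptions made for this example. The rounding here goes through a double multiplication, so the last digit can be off by one in rare cases; an exact version would work from the integer mantissa and exponent instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range-limited formatter: writes x as [-]d...d.ffffff
 * (at most 13 integer digits, exactly 6 fraction digits).
 * Returns the string length, or -1 if x is out of range, NaN,
 * or the buffer is too small.  No heap, no big-number arithmetic. */
static int format_fixed6(char *buf, size_t size, double x)
{
    char tmp[24];
    size_t n = 0;
    int negative = 0;

    if (!(x > -1e13 && x < 1e13))
        return -1;              /* out of range (or NaN): say No */
    if (x < 0.0) {
        negative = 1;
        x = -x;
    }

    /* Scale to millionths and round; x * 1e6 <= 1e19 fits in 64 bits. */
    uint64_t scaled = (uint64_t)(x * 1e6 + 0.5);

    for (int i = 0; i < 6; i++) {           /* fraction digits */
        tmp[n++] = (char)('0' + scaled % 10);
        scaled /= 10;
    }
    tmp[n++] = '.';
    do {                                    /* integer digits */
        tmp[n++] = (char)('0' + scaled % 10);
        scaled /= 10;
    } while (scaled != 0);
    if (negative)
        tmp[n++] = '-';

    if (n + 1 > size)
        return -1;
    for (size_t i = 0; i < n; i++)          /* reverse into buf */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return (int)n;
}
```

A wrapper could try this first and only fall back to a bigger, slower formatter when it returns -1, so firmware that never needs the fallback never has to link it.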
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4026
  • Country: nz
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #60 on: August 17, 2022, 02:26:46 am »
Quote
Steele & White correctly show that to do the job properly (i.e. accurately in every case), you need to use multi-word arithmetic, and not just 2 or 3 words but numbers hundreds or thousands of bits long.
Is the amount of storage required dependent on the number of digits that you actually want to print?

No, the actual storage needed is dependent on the absolute value of the exponent. If you're not pushing the limits of 10^±308 (or 10^±38 for float) then you don't need anywhere near as much.

Quote
I would think that printing "several fewer" digits than are accurate in the internal representation would be easier.

Sure, if you want to print only an approximate rounded value then that's always much easier, especially if you don't care if something right on the cusp sometimes rounds the wrong way.

Quote
The desire to "print any floating point number to a decimal string, and then convert the string back to a floating point number, and it will have the EXACT SAME bits as the original one" seems an unnecessarily high bar, for the typical embedded system displaying a number to a human user. :-(

Sure, which is why embedded systems typically take all kinds of shortcuts, including arithmetic that isn't IEEE-compliant in the first place.

If you just want to put the value into something you can transmit over a communication channel that isn't binary-safe, you can print hex rather than decimal. Or some kind of Base-64 if you don't mind a little more bit shuffling.
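
A minimal sketch of the hex approach for a single-precision float, assuming float is a 32-bit IEEE-754 binary32 value (the static assertion checks only the size). The function names `float_to_hex` and `hex_to_float` are made up for this example. The round trip is bit-exact by construction, with no decimal conversion involved.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Write the 32 bits of value as 8 uppercase hex digits plus a NUL. */
static void float_to_hex(char out[9], float value)
{
    static const char digits[] = "0123456789ABCDEF";
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits);     /* well-defined type pun */
    for (int i = 7; i >= 0; i--) {
        out[i] = digits[bits & 0xFu];
        bits >>= 4;
    }
    out[8] = '\0';
}

/* Inverse of float_to_hex.  Returns 0 on success, -1 if the first
 * 8 characters are not all uppercase hex digits. */
static int hex_to_float(const char *in, float *value)
{
    uint32_t bits = 0;

    for (int i = 0; i < 8; i++) {
        char c = in[i];
        uint32_t d;
        if (c >= '0' && c <= '9')
            d = (uint32_t)(c - '0');
        else if (c >= 'A' && c <= 'F')
            d = (uint32_t)(c - 'A' + 10);
        else
            return -1;
        bits = (bits << 4) | d;
    }
    memcpy(value, &bits, sizeof bits);
    return 0;
}
```

For instance, 1.0f is transmitted as 3F800000. Both ends have to agree on the float format, which is the price of skipping the decimal conversion.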
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4026
  • Country: nz
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #61 on: August 17, 2022, 02:39:11 am »
Quote
if the format and range is limited to an "easier" one like the ±ddddddddddddd.ffffff above, you could use a very fast, no-extra-memory needing version even in an interrupt context, but it would just say No by returning an error if the value is not within the range and clamping is not allowed.

Sure, if you're allowed to print "OVERFLOW" or clamp to the maximum value or something then no problem.

Quote
But you cannot really do this with printf() in any sane manner.  For example, you cannot tell printf() to print say ±999999.999 for numbers that cannot be printed in ±6.3 format.  It cannot tell if at runtime the big-slow is needed or not, so it would have to be included in the firmware anyway; and then the other versions become just extra bloat –– a runtime speed optimization!

The main issue in small machines is that you need to have the MEMORY SPACE available for the big slow version in any case, whether it's a static buffer, heap space that is maybe never deallocated, or stack space. I tend to favour stack space (about 1 KB for IEEE double) because usually you are printing from your program main function (or near to it), and you can use that stack space for computation code when you are not printing.

Quote
–– which is not in the purview of a standard/base C library at all.

It's entirely appropriate for the C library to pick a faster version that uses less memory when it knows it will do the job. That's extra code space, but code space is maybe less likely to be running out. If code space *is* a problem then compile your libc with appropriate options.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: fi
    • My home page and email address
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #62 on: August 17, 2022, 02:56:49 am »
Quote
The main issue in small machines is that you need to have the MEMORY SPACE available for the big slow version in any case, whether it's a static buffer, heap space that is maybe never deallocated, or stack space. I tend to favour stack space (about 1 KB for IEEE double) because usually you are printing from your program main function (or near to it), and you can use that stack space for computation code when you are not printing.
Me too; and splitting the formatting function into helpers not only makes the code easier to work with, but also ensures stack space is "reused" and thus minimized.

(GCC in particular is not very good at determining when it can reuse the same storage for different local variables in the same function.)

Quote
–– which is not in the purview of a standard/base C library at all.
It's entirely appropriate for the C library to pick a faster version that uses less memory when it knows it will do the job. That's extra code space, but code space is maybe less likely to be running out. If code space *is* a problem then compile your libc with appropriate options.
Well, I'd like to agree, but my (limited and not at all recent) experience with existing C library implementations is that the developers tend to reject such, saying it is outside the scope/purview of their project.

I don't know how I could have worded that better, but it seems to be so hard to get such stuff included in the existing open source standard C or base C libraries (like newlib) that I definitely would not bother even trying.  (Which is one reason I've been dabbling in designing something better for C from scratch.)

In this particular case, there is no need to throw the entire base C library away, just drop printf() (or at least its ability to print floating point numbers), and use something else instead.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3694
  • Country: gb
  • Doing electronics since the 1960s...
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #63 on: August 17, 2022, 09:48:36 am »
As I think I've said before, ever since C came out for embedded, say 40 years ago, one had

- integer-only printf options (mine was a stripped down fuller one, ints and longs, and I called it ilsprintf() )
- non-IEEE floats (the somewhat weird accuracy specs are irrelevant, especially on output where one is mostly doing e.g. %7.2f)
- no doubles (doubles have only highly esoteric applications and practically never in embedded systems)

None of the above need a heap, mutexes, etc. Even IEEE single floats don't need a heap.
« Last Edit: August 17, 2022, 10:03:05 am by peter-h »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3894
  • Country: gb
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #64 on: August 17, 2022, 12:08:31 pm »
Quote
drop printf()

precisely!

it's simpler to have a show() method for each datatype.
- it consumes less resources
- it's less error-prone
- it's embedded-friendly
- it can be easily customized on demand
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6818
  • Country: va
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #65 on: August 17, 2022, 01:41:50 pm »
Quote
it's simpler to have a show() method for each datatype.

Then you end up with:

Quote
puts("blah ");
show_type(val);
puts("blah blah: ");
show_type(stuff);
puts("\r\n");

Which just makes your code that little harder to read and change. So now you have an overarching function that encapsulates all that, perhaps giving an easily-modified string for the format and then the show stuff as parameters, and all of a sudden you've reinvented printf();

If you want the show stuff then why not just rewrite printf to use those functions instead of whatever it came with? Then you get what you want and the code still remains compatible with standard libraries.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3894
  • Country: gb
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #66 on: August 17, 2022, 02:47:34 pm »
Quote
Then you end up with:
..
Which just makes your code that little harder to read and change.

Yup, a bit cumbersome and quite a bit more verbose, but it's perfectly fine!

Note that, as mentioned by @SiliconWizard, C11 can give you a generic "show()" method thanks to "_Generic", so you won't need to write show_${datatype}, just show() and the compiler will do the job for you.
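
A minimal sketch of what such a `_Generic`-based show() could look like, assuming a C11 compiler. The helper names (`emit`, `show_str`, `show_u32`, `show_bool`) and the static `out` buffer standing in for a real UART or display are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in output sink; a real system would write to a UART or similar. */
static char out[64];
static size_t out_len;

static void emit(char c)
{
    if (out_len + 1 < sizeof out) {
        out[out_len++] = c;
        out[out_len] = '\0';
    }
}

static void show_str(const char *s)
{
    while (*s)
        emit(*s++);
}

static void show_u32(uint32_t v)
{
    char digits[10];
    int n = 0;

    do {
        digits[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    while (n > 0)
        emit(digits[--n]);
}

static void show_bool(bool b)
{
    show_str(b ? "true" : "false");
}

/* The compiler picks the helper from the static type of x. */
#define show(x) _Generic((x),       \
        bool:         show_bool,    \
        uint32_t:     show_u32,     \
        char *:       show_str,     \
        const char *: show_str)(x)
```

A type without an entry in the list (a plain int, for example) is a compile-time error rather than a silent mismatch, which is the main attraction over a format string.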


In Ada you *mostly* write that show_${datatype} way(1), and it's perfectly ok!

Code:
datatypes="uint32, uint16, uint8, ... boolean, ..."
for datatype in ${datatypes}
       do
              use show_${datatype}()
       done

In C, show_${data_type}():
- is simpler to be read (especially when people pad printf(...) with 1981548...32171 parameters)
- is simpler for breakpoints or breakouts
- is simpler for patches, since you can apply a patch to each line and it makes sense

and, on the top of this, to *mimic* printfs() a language like myC would need to involve
- autoreference
- autotype
- monads
- local pool (with auto free) to support list but avoiding to use malloc&C
- list

All of this, just to allow you to write something like this
Code:
show
(
    @list
    (
        @string("blah "),
        @uint32(val),
        @string("blah blah: "),
        @stuff(stuff),
        @string("\r\n"),
    )
);

(1) to *mimic* printf, Ada offers even more complex inner mechanisms.

Is it worth it .... ?

I don't think so. Hence, thanks, but no :--
« Last Edit: August 17, 2022, 03:48:16 pm by DiTBho »
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: fi
    • My home page and email address
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #67 on: August 17, 2022, 04:23:04 pm »
Quote
Which just makes your code that little harder to read and change.
Yes: a formatting interface where you construct strings piecewise using separate functions, will take some effort to get used to.

Thing is, if you want the benefits, something has gotta give.

(Side note: Maintainability is important, but not because it makes developers' jobs easier.  It is important because it reduces the number of bugs.  The true impact on maintainability cannot be measured by how easy some code is to read or how nice it looks; it must be measured by how likely it is for a developer of the expected level to modify and maintain such code without introducing new bugs, how likely they are to notice bugs in existing code, and how easy it is for them to fix such bugs.)

In the Constructing short C strings in limited or constrained situations thread from a year ago, in reply #24, I outlined an example of a replacement for printf()-like formatting I've been mulling in my mind.  It is completely different to how printf() specifiers work, but would allow the kind of features that now require separate formatting functions ("show_type()").

Quote
If you want the show stuff then why not just rewrite printf to use those functions instead of whatever it came with?
Because printf() formatting interface does not let you specify things like "clamp to this width".

When you extend the printf() interface, the formatting gets even uglier; just look at the hoops the Linux kernel jumps through: it adds extra characters after the standard conversion specifier pattern.  They get really hairy really quickly.  (The benefit, however, is that the compiler can still check the variadic arguments for their correct types, and thus help avoid bugs.)

Or, put more simply, the proper interface to formatting functions is not just show_type(variable); it is
    status = format_type(destination, value, options);
and it is exactly the options part that requires a non-printf interface.

It is just that it is easiest to implement as separate functions, because a formatting specification language is something built on top of those.
(The standard C library implementations typically do not expose those separate functions at all.  Even optional functions like strfromf() use the printf formatting language.)
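
A minimal sketch of what such a `status = format_type(destination, value, options)` call could look like for an unsigned 32-bit integer, with the "clamp to this width" option that printf() lacks. All names here (`struct fmt_options`, `enum fmt_status`, `format_u32`) are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fmt_options {
    size_t width;   /* output is exactly this many characters */
    bool   clamp;   /* true: print all 9s when the value does not fit */
};

enum fmt_status {
    FMT_OK       =  0,
    FMT_CLAMPED  =  1,   /* value did not fit; all-9s printed instead */
    FMT_ERANGE   = -1,   /* value did not fit and clamping not allowed */
    FMT_ENOSPACE = -2    /* destination too small for width + NUL */
};

static enum fmt_status format_u32(char *dst, size_t size, uint32_t value,
                                  const struct fmt_options *opt)
{
    size_t width = opt->width;
    size_t i = width;
    enum fmt_status status = FMT_OK;

    if (width == 0 || width + 1 > size)
        return FMT_ENOSPACE;

    /* Fill digits from the right until the value or the field runs out. */
    do {
        dst[--i] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0 && i > 0);

    if (value != 0) {               /* digits left over: does not fit */
        if (!opt->clamp) {
            dst[0] = '\0';
            return FMT_ERANGE;
        }
        for (size_t j = 0; j < width; j++)
            dst[j] = '9';
        i = 0;
        status = FMT_CLAMPED;
    }
    while (i > 0)                   /* left-pad with spaces */
        dst[--i] = ' ';
    dst[width] = '\0';
    return status;
}
```

The caller always gets exactly `width` characters or an error, so the worst-case buffer size is known at the call site.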
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3894
  • Country: gb
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #68 on: August 17, 2022, 05:54:39 pm »
Quote
Or, put more simply, the proper interface to formatting functions is not just show_type(variable); it is
    status = format_type(destination, value, options);
and it is exactly the options part that requires a non-printf interface.

Yup, even better: it's the step++ from the total demolition of printf;

  • day1, first step, you delete printf.c (or whatever the filename is) from your libc, then you recompile, but you have to be brave, don't listen to any colleague, you have to cross out every single mention of it in every single book with an ash black marker, and hack every single eBook with sed search and replace (with nothing, which is what printf deserves ... nothing of nothing)
  • day2, you spend the whole day celebrating the death of the big fat evil dragon. It's vanished, gone, never existed and never will; may he rest in peace, burning to pieces among the deepest, worst computer science ideas ever.
    If you have a WWF badge because you have feelings for the protection of rare animals, well ... I know, I know, it might sound cruel, but in this case it's all a matter of Newton's third law: to move forward you have to leave something behind
  • day3, you introduce show_type(), a simple piece of code, suitable for a light C support library
  • day4, you spend the whole day appreciating how much better it is
  • day5, next step, you eat a pancake batter and you make it better, so you evolve show_type() into format_type(), a bit more complex, but not by much, more flexible and much more powerful

One week, 5 steps! Maybe you will need to extend day5 into a second week to perfect it, but hey, it's OK, it couldn't get any better  :D :D :D
« Last Edit: August 17, 2022, 06:39:27 pm by DiTBho »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3894
  • Country: gb
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #69 on: August 17, 2022, 05:57:22 pm »
(
next next step, in the far far future, likely in a different galaxy where people measure the power of things by talking about the light side or dark side of things (the dark side is always unlimited, don't question): if you are willing to add more features to a language like myC, well ... the best way to make it *dark side* is to ...

... embed the behavior within data_type and let format_type() take advantage of it  :o
)
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: fi
    • My home page and email address
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #70 on: August 17, 2022, 07:23:09 pm »
Quote
embed the behavior within data_type
Yup, exactly.

Let's say we construct a completely new string formatting system, based on say
    '{' [ number ':' ] type [ '/' options ] '}'
where number identifies the parameter to be formatted, type is a short string identifying the formatter, and options is a string passed as-is to the formatter.  The number is useful in localization, so that the order of items in the string can differ from the order of the parameters to the function.  The actual formatter function interface could be something like say
    int formatter(buffer, pointer-to-data, options, context)

This assumes that instead of passing variadic arguments to be formatted (to the actual printf-replacement), we pass their addresses instead.  This avoids the standards-required type promotions for variadic arguments, like float to double.  So, one could print e.g. Hello world to Serial using
    format_string(Serial, "Hello, world!\n");
or say a debugging message related to the user poking a touch panel using
    format_string(Serial, "Touch event at ({1:i},{2:i})\n", &x, &y);

If we limit the implementation to systems that support ELF object files and other formats with section support, we can use section magic to autoregister formatters at build time.  That way, no RAM is wasted in describing the base set of formatters.  Then, if someone wants to implement their own formatter, all they need to do is define the formatter function, and then use a preprocessor macro, say
    DECLARE_FORMATTER(formatter, "type", context);
to add their own to the same list.  The macro emits a data structure to a dedicated section, so that the linker gathers them all into a single array in memory.  On a 32-bit system, it would make sense to limit the type names to 8 characters, so that the structure would be 16 bytes (and reside in ROM/Flash).
(It would be nice to sort that array as a post-linking step, so that a binary search could be used to quickly find the proper formatter.)

Since the linker uses the addresses of the functions declared as formatters, it can trivially leave any undeclared formatters out from the final binary.  This means preprocessor macros can be easily used to control what formatters are available by default.

On an AVR with separate address spaces for Flash and RAM, you could have an option that indicates when the pointer points to the not-default address space, or you could simply have different formatters for e.g. strings in Flash vs. strings in RAM.

If the buffer interface is not just an array, but also supports a window to the output buffer (so that in cases where the data is too long to fit into the buffer, only some of it is stored in the buffer, and another identical call but with a different window will generate more of it later), the same formatting interface can be bolted on to any kind of stream, FIFO, socket, file, or other contraption.

The interface itself should be re-entrant, so that each formatter can use the same interface to implement itself.  For example, a formatter for dates might use the context parameter to point to the current "locale", and format the date using said locale automagically, by picking the formatting string based on the locale.  There is some risk of accidentally deep recursion, though, if users create silly formatters. But C does not protect silly people from creating footguns, so I think it is mostly a documentation issue.
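
To illustrate the re-entrancy, here is a minimal sketch in which a date formatter implements itself by calling back into the same formatting routine with a format string taken from the locale. Everything in it is an assumption made for the example: `format_into`, `struct locale`, `struct date`, the `d` type letter, and the restriction to single-digit argument indexes and single-letter type names. Only the `{1:i}` placeholder style is taken from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct date   { int year, month, day; };
struct locale { const char *date_format; };   /* hypothetical locale record */

size_t format_into(char *buf, size_t len, const char *fmt,
                   const void *const *args, size_t nargs,
                   const struct locale *loc);

/* Write v in decimal, truncating to len bytes; returns bytes written. */
static size_t put_int(char *buf, size_t len, int v)
{
    char         tmp[12];
    size_t       n = 0, i;
    unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;

    do { tmp[n++] = (char)('0' + u % 10u); u /= 10u; } while (u != 0);
    if (v < 0)
        tmp[n++] = '-';
    for (i = 0; i < n && i < len; i++)
        buf[i] = tmp[n - 1 - i];        /* digits were produced in reverse */
    return i;
}

/* The date formatter re-enters format_into with the locale's own format. */
static size_t put_date(char *buf, size_t len, const struct date *d,
                       const struct locale *loc)
{
    const void *const parts[] = { &d->year, &d->month, &d->day };

    assert(loc != NULL && loc->date_format != NULL);
    return format_into(buf, len, loc->date_format, parts, 3, loc);
}

/* Expands "{N:i}" (int) and "{N:d}" (date), N = 1..9, copying everything
   else verbatim.  Output is truncated to len bytes and not NUL-terminated. */
size_t format_into(char *buf, size_t len, const char *fmt,
                   const void *const *args, size_t nargs,
                   const struct locale *loc)
{
    size_t used = 0;

    while (*fmt != '\0' && used < len) {
        if (fmt[0] == '{' && fmt[1] >= '1' && fmt[1] <= '9' &&
            fmt[2] == ':' && fmt[3] != '\0' && fmt[4] == '}') {
            size_t idx = (size_t)(fmt[1] - '1');
            if (idx < nargs && fmt[3] == 'i')
                used += put_int(buf + used, len - used, *(const int *)args[idx]);
            else if (idx < nargs && fmt[3] == 'd')
                used += put_date(buf + used, len - used, args[idx], loc);
            fmt += 5;                   /* skip the whole placeholder */
        } else {
            buf[used++] = *fmt++;
        }
    }
    return used;
}
```

A locale whose `date_format` itself contained `{1:d}` would recurse without end, which is exactly the footgun mentioned above; a depth counter passed along with the locale would be one way to cap it.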

This is just an example, but hopefully gives an idea how one could replace printf with something much better, much easier to control, and based on user-extensible formatter functions.  And all designed to support embedded development, especially when tight on RAM.
« Last Edit: August 17, 2022, 07:26:14 pm by Nominal Animal »
 
The following users thanked this post: DiTBho

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14431
  • Country: fr
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #71 on: August 17, 2022, 07:31:01 pm »
One real problem with formatted printing is the security bomb lurking if the formatting string either doesn't match the arguments or gets modified at run-time for an unexpected reason.
If the formatting string is a constant, the compiler may be smart enough to optimize this and generate code that would essentially look like the manual, separate function for each part of the format, but I'm not too sure about that.
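
As a hedged illustration of the mismatch problem for printf-style wrappers: GCC and Clang can check a constant format string against its arguments at compile time if the wrapper carries the `format` attribute, and the run-time modification risk is usually avoided by never letting external text act as the format string. `log_line` and `log_untrusted` are hypothetical names; `vsnprintf` and the attribute are the only real interfaces used.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* The attribute says: argument 3 is a printf-style format string and the
   values to check start at argument 4.  With format warnings enabled
   (for example -Wall), a call such as log_line(buf, n, "%s", 42) is
   reported at compile time. */
int log_line(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int log_line(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int     n;

    assert(buf != NULL && len > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);   /* never writes more than len bytes */
    va_end(ap);
    return n;
}

/* Text from outside the program is passed as data, never as the format,
   so any '%' sequences in it are printed literally, not interpreted. */
int log_untrusted(char *buf, size_t len, const char *text)
{
    return log_line(buf, len, "%s", text);
}
```

This only helps when the format string is a literal the compiler can see; a format string assembled or corrupted at run time gets no such checking.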
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6818
  • Country: va
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #72 on: August 17, 2022, 07:39:34 pm »
Quote
Hello world to Serial using
    format_string(Serial, "Hello, world!\n");
or say a debugging message related to the user poking a touch panel using
    format_string(Serial, "Touch event at ({1:i},{2:i})\n", &x, &y);

That's the kind of thing I had in mind when I said you'd[1] just reinvented printf.

---
[1] The royal 'you'
 

Offline tellurium

  • Regular Contributor
  • *
  • Posts: 226
  • Country: ua
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #73 on: August 17, 2022, 08:28:57 pm »
  • day5, next step, you eat a pancake batter and you make it better, so you evolve show_type into format_type(), a bit more complex, but not so much, more flexible and much more powerful

Sounds interesting. Could you share a couple of examples, please? Like, a typical log line that shows a bunch of values, or something like that.
Open source embedded network library https://mongoose.ws
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26871
  • Country: nl
    • NCT Developments
Re: Best thread-safe "printf", and why does printf need the heap for %f etc?
« Reply #74 on: August 17, 2022, 11:23:07 pm »
It's entirely appropriate for the C library to pick a faster version that uses less memory when it knows it will do the job. That's extra code space, but code space is maybe less likely to be running out. If code space *is* a problem then compile your libc with appropriate options.
Precisely! The problem isn't printf; it's that the library implementation may not suit your needs. I use 3 or 4 different printf implementations in my embedded projects depending on the formatting features needed. In the end the C library designers were pretty clever to come up with a text printing & formatting solution / convention (=printf) which is both versatile and can be light-weight if it has to.

Having basic type specific print functions is the worst idea ever. I've seen several people do that and it always ends in a mess. It isn't standard, it is not easy to use and circles back to why printf exists in the first place: it is a good basic way of formatting numbers. Don't try to re-invent a square wheel, printf is the perfectly round wheel.

If you still persist in using type specific printing then switch to C++ and use streams. You can even overload the formatting to print your own defined type.
« Last Edit: August 18, 2022, 12:29:05 am by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

