Author Topic: stdlib + MCU haters: std itoa() uses less resources!  (Read 17786 times)


Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #150 on: June 08, 2023, 02:46:34 pm »
Ah, okay.  I thought/assumed this was about code running on a microcontroller or similar embedded and constrained devices.
I can see why the folks here would assume that it's about microcontroller / embedded systems but (as we all know) C has more applications than just embedded.
No, we assume that because of the topic at hand, not in general.

I personally use C a lot in a fully hosted environment, for both systems programming (services like nginx or apache), and heavy-duty-computing and low-level I/O parts in desktop/GUI applications.

My preferred GUI application development is to have all heavy work done in C, but the user interface written in Python, so that end users can tweak the user interface without a development environment, even if the application innards are proprietary Secret Sauce.  (I develop code under various different licenses, from public domain to copyleft to completely proprietary: Tux is my mascot, not my idol.  I do prefer free/open source, because I'm more productive there myself, but I am not ideologically against proprietary-licensed code either.  I've even signed (sensible) NDAs.)

I have the same viewpoint as you, it's almost impossible to ditch the "old" C "standard library" - something else has to replace it with a completely new interface. But then you might as well create a new language with a similar or same syntax as C.
This is where our opinions diverge.  In my opinion, the C compiler is adequate, simply based on the amount of good C code out there.  The fact that there are vast amounts of not-so-good and downright horrible code has more to do with popularity and historical baggage than with the properties of the C language itself.
Therefore, it seems to me that starting from scratch would be to 'throw the baby out with the bathwater', as the saying goes.

Indeed, as I mentioned, I have been investigating exactly what would be better than the existing GNU/POSIX C library in Linux, as crazy as it might be.
It is utterly different to what I think would be better than avr-libc/newlib in microcontrollers (which I have also been investigating on and off, separately).

Yet, the base C language suffices for both for me.  (Well, I would like one addition/change: make arrays instead of pointers the base type, with a few attendant changes, all backwards-compatible, so that in code using them, buffer underruns and overruns and other memory reference bugs would be caught at compile time.  The C compilers already do this, but we need some additions to help pass array size information across function call boundaries at compile time.  This does not add runtime data or new or hidden parameters to function calls; it is all about compile time.)
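
For what it's worth, C99 already has a partial form of passing array size information across a call: the [static n] parameter declarator.  The sketch below is only an illustration of that existing syntax, not the addition described above; sum_ints is a made-up name, and whether a too-small argument is actually diagnosed depends on the compiler and its warning options (treat that as an assumption to verify on your toolchain).

```c
#include <stddef.h>

/* 'values' must point to at least n ints; the [static n] declarator
   documents that promise to the compiler without adding any runtime
   data or hidden parameters. */
long sum_ints(size_t n, const int values[static n])
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += values[i];

    return total;
}
```

A call like sum_ints(4, v) with int v[4] compiles to an ordinary pointer-plus-length call; the size relation exists only at compile time.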

If GCC c++ were to grow clang-like support for named address spaces, then I would actually prefer a subset of c++ in AVR-class microcontrollers.
On Cortex-M's, my example code snippets tend to be in C because it is the lowest common denominator, but for my own code, I use the freestanding C/C++ mix I mentioned in a previous message.

Many disagree exactly which subset of C++ is appropriate on Cortex-M's and similar, and most disregard the effects of that subset on the library implementation and runtime requirements, which is why I tend to not participate in such discussions.  I even see C++ embedded developers wishing for exception support...  I tend to use templates, class encapsulation, class inheritance for common features like output formatting and input parsing, and function overloading, only, because that seems to be about the right balance for me, without requiring any runtime support (like stack unwinding for exceptions).

I would be surprised to learn that a MCU platform would support POSIX
newlib does contain most of POSIX C interfaces, and avr-libc says "Some additions have been inspired by other standards like IEEE Std 1003.1-1988 ("POSIX.1"), while other extensions are purely AVR-specific (like the entire program-space string interface)."

For example, both provide strdup(), memccpy() (note extra C, this copy stops when a specific separator byte is found), and even GNU memmem() (finds a specific byte sequence in a specified memory range, similar to strstr()).
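
As a hedged illustration of the memccpy() behaviour described above, here is a small sketch that copies the part of a string before a separator.  copy_field is a hypothetical helper, and the feature test macro on the first line is an assumption about what a glibc-style library wants in order to declare memccpy().

```c
#define _XOPEN_SOURCE 700   /* assumption: enough to expose memccpy() */
#include <stddef.h>
#include <string.h>

/* Copies the part of src that precedes sep into dst (dstsize bytes),
   always NUL-terminating when dstsize > 0.  Returns the number of
   characters stored, not counting the terminating NUL. */
size_t copy_field(char *dst, size_t dstsize, const char *src, char sep)
{
    size_t len, n;
    char *after;

    if (dstsize == 0)
        return 0;

    len = strlen(src);
    n = (len < dstsize - 1) ? len : dstsize - 1;

    /* memccpy() stops right after copying sep, or returns NULL
       if sep does not occur within the first n bytes. */
    after = memccpy(dst, src, sep, n);
    if (after != NULL)
        n = (size_t)(after - dst) - 1;   /* overwrite the copied sep */

    dst[n] = '\0';
    return n;
}
```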

Yet, because POSIX C is specifically System Interfaces, supporting it fully in a freestanding environment is not possible – there is no 'system' to interface to.

(For those who are unaware, POSIX C features are included in the standard C library on systems that do support them.  In some, like Linux, you do need to define feature test macros before including any header files, to expose the "extra" features, though.  In Linux, #define _POSIX_C_SOURCE 200809L currently exposes all POSIX.1 C features.  It is important to realize that this does not cause anything extra to be loaded into memory at run time, or any extra overhead of any kind; it just changes which of the features in the base C library are exposed at compile time.)
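
A minimal sketch of the feature test macro in use, assuming a hosted POSIX system.  bounded_dup is a made-up helper; strdup() and strnlen() are the POSIX.1-2008 functions being exposed.

```c
#define _POSIX_C_SOURCE 200809L   /* must come before any #include */
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of s if it is shorter than limit characters,
   or NULL if it is too long (or if allocation fails).
   The caller frees the result with free(). */
char *bounded_dup(const char *s, size_t limit)
{
    if (strnlen(s, limit) >= limit)
        return NULL;

    return strdup(s);
}
```

The define only changes which declarations the headers make visible; the functions come from the same C library either way.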
 
The following users thanked this post: SiliconWizard

Offline MMMarco

  • Regular Contributor
  • *
  • Posts: 69
  • Country: ch
  • Hobbyist. ⚠️ Opinionated
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #151 on: June 08, 2023, 04:37:07 pm »
No, we assume that because of the topic at hand, not in general.

I'm going to take your word for it, although I'm pretty sure I made myself clear by saying "[..] but C isn't exclusively used for MCUs either." hinting at the fact that I'm not talking about just MCUs / embedded.

This is where our opinions diverge.  In my opinion, the C compiler is adequate, simply based on the amount of good C code out there.

I wasn't talking about the C compiler but the standard library; you can't just expose a new clean API with existing standard names.

What would you call the "new" functions, types etc. in a "new standard library"? You'd have to prefix or rename everything, which makes things incredibly messy.

In addition to that, many many million lines of C already have been written. So you would need (if you do it correctly) to choose names that aren't used in many codebases - good luck with that.

Hence, it wouldn't make sense (possibly one of the reasons why such a thing doesn't exist) to create a "new standard library" interface, it makes much more sense to start relatively from scratch and do it properly (Rust seems to do just that [I don't know about Rust's API but their model seems to be to check everything at compile time]).

What you're advocating is putting a band-aid over band-aids, it's just not the right way to go.

C isn't going to change much (IMO) and that's probably a good thing.

I even see C++ embedded developers wishing for exception support

 :-DD
« Last Edit: June 08, 2023, 05:01:05 pm by MMMarco »
27 year old Software Engineer (mostly JavaScript) from Switzerland with a taste for low level stuff like electronics 😊

 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #152 on: June 08, 2023, 06:12:22 pm »
In addition to that, many many million lines of C already have been written. So you would need (if you do it correctly) to choose names that aren't used in many codebases - good luck with that.
These words can cause pain.
Luckily, the GCC has a very short memory. He swears at the coincidence of names, but easily forgets his luggage, and starts working with a new one.

Here, a new function hex_char - makes a string in hex format from an address (number). https://godbolt.org/z/75snGKfsa
Perhaps a different name - suggest.
Other interface is possible - please suggest.
It would be very cool if the function kept working on all systems (from 8 bits to 64 bits); for that it needs to be changed.
The function must be rebuilt according to the size of the pointer.

A very important condition is that 0xF must be padded to 0x0F, 0x12F ->> 0x012F, 0x1234F ->> 0x0001234F,
0x12345678F ->> 0x000000012345678F.
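
To make the padding rule concrete, here is a plain, unoptimized sketch of it: the digit count is padded up to 2, 4, 8 or 16.  hex_padded is a made-up name, the value is taken as uint64_t rather than a pointer, and this is not the code behind the godbolt link, only a readable statement of the requirement.

```c
#include <stdint.h>

/* Builds "0x..." backwards, ending at tail (which receives the NUL).
   The buffer needs room for 2 + 16 + 1 = 19 bytes in the worst case.
   Returns a pointer to the start of the string. */
char *hex_padded(char *tail, uint64_t value)
{
    unsigned digits = 0;

    *tail = '\0';
    do {
        unsigned d = (unsigned)(value & 0xFu);

        *--tail = (char)(d + ((d < 10) ? '0' : 'A' - 10));
        value >>= 4;
        digits++;
        /* keep going until the value is used up and the digit count
           is a power of two that is at least 2 */
    } while (value != 0 || digits < 2 || (digits & (digits - 1)) != 0);

    *--tail = 'x';
    *--tail = '0';
    return tail;
}
```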
« Last Edit: June 08, 2023, 07:30:43 pm by AVI-crak »
 

Offline MMMarco

  • Regular Contributor
  • *
  • Posts: 69
  • Country: ch
  • Hobbyist. ⚠️ Opinionated
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #153 on: June 08, 2023, 08:05:26 pm »
These words can cause pain.
Luckily, the GCC has a very short memory. He swears at the coincidence of names, but easily forgets his luggage, and starts working with a new one.

I'm sorry but I don't understand. Who do you mean by 'him'?

Here, a new function hex_char - makes a string in hex format from an address (number). https://godbolt.org/z/75snGKfsa

I don't mean to be rude, but that code is written pretty awfully in my opinion.

Lots of "magic numbers" (0x57, 0x41 etc.)  without any comments to what they're doing, no clear indication what the particular lines are doing too.

 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #154 on: June 08, 2023, 08:39:01 pm »
Code: [Select]
#include <stdint.h>

//tail_txt - pointer to the end of the array; the string is built backwards from there
char* hex_char(char* tail_txt, uint32_t value)//56
{
    uint32_t tmp;
    //0x57 = binary 01010111, placed in the top byte of an int
    //shifted left once per digit, its sign bit is clear after exactly
    //2, 4 and 8 shifts, which pads the output to 2, 4 or 8 digits
    int nc = 0x57 << ((sizeof(int)-1)*8);
    *tail_txt = 0;
    do{
        do{
            tmp = value;
    //hex - take 4 bits per character
            tmp &= 0x0Fu;
    //digits 0-9 wrap around to a huge unsigned value, 10-15 become 0-5
            tmp -= 0x0Au;
    //the gap between '9'+1 and 'A' is 7 characters (very handy)
    //shifting the wrapped value right leaves 7 for 0-9 and 0 for A-F
    //the shift count assumes int is as wide as uint32_t (32 bits)
            tmp -= tmp >> ((sizeof(int)*8)-3);
    //0x41=0x30+7+0x0A, 0x30 = '0'
            tmp += 0x41u;
            *(--tail_txt) = tmp;
            value >>= 4;
            nc <<= 1;
        }while (value );
    //the outer loop can only be left after 2, 4 or 8 digits
    }while (nc < 0);
    //volatile: otherwise GCC sometimes decides to do one write instead of two
    *(volatile char*)--tail_txt = 'x';
    *(volatile char*)--tail_txt = '0';
    //pointer to the beginning of the string
    return tail_txt;
}

The code size is smaller. Magic can be dispelled. Perhaps you will succeed.
The "if" statement completely removes the magic, but reduces speed and increases size.
 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #155 on: June 08, 2023, 08:44:52 pm »
Who do you mean by 'him'?
GCC,compiler
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #156 on: June 08, 2023, 09:20:55 pm »
Both GCC and Clang generate quite good code for
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
across a large number of architectures.

For example, GCC generates a conditional jump on AVR costing only an extra cycle with -Os and -O2; a compare, if-then-else, load-if-less, load-if-higher, add sequence on Cortex-M4/Thumb with -Os and -O2; a compare, cmovbe sequence or a compare, subtract-with-borrow, and, load-effective-address sequence on x86-64 with -O2; and a compare, subtract-with-borrow, and, load-effective-address sequence on x86 with -O2.

Simply put, it not only outperforms your 'trick', it is also much easier to maintain for us humans.

Tricks like subtracting ten from digit or adding six to it, and then manipulating the bit pattern to avoid conditionals just lead to extra instructions used on architectures where conditional addition/loading is trivially cheap –– like it is on Cortex-M4/M7/Thumb and x86-64.

It is optimization at the wrong level: if you want specific instructions, use extended assembly; otherwise, enable optimizations and let the compiler deal with it.  Doing it your way, you'll have to pore through your 'optimizations' for every single compiler and compiler version and instruction set architecture separately, and edit to get the desired output, as otherwise you have very unoptimal implementations.
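
Before comparing machine code, it is worth checking that the two forms really are interchangeable.  A sketch, with made-up function names; the trick is hard-coded for a 32-bit unsigned type, and both functions use uppercase letters so they can be compared with the hex_char code earlier in the thread.

```c
#include <stdint.h>

/* Plain conditional form (uppercase variant). */
char hex_digit_cond(unsigned digit)
{
    return (char)(digit + ((digit < 10) ? '0' : 'A' - 10));
}

/* Branch-free variant: subtract ten, then turn the wrapped-around
   top bits into the 7-character gap between '9' and 'A'. */
char hex_digit_trick(uint32_t digit)
{
    uint32_t tmp = digit - 10u;   /* 0..9 wrap to 0xFFFFFFF6..0xFFFFFFFF */

    tmp -= tmp >> 29;             /* 7 for 0..9, 0 for 10..15 */
    return (char)(tmp + 0x41u);   /* 0x41 == 'A' */
}

/* Returns 1 if both forms agree for every 4-bit value, 0 otherwise. */
int hex_forms_agree(void)
{
    for (unsigned d = 0; d < 16; d++)
        if (hex_digit_cond(d) != hex_digit_trick(d))
            return 0;

    return 1;
}
```

With both known to give the same characters, the only remaining question is which one a given compiler turns into better code, and that is what Compiler Explorer is for.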
 
The following users thanked this post: newbrain, MMMarco

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21657
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #157 on: June 08, 2023, 09:25:25 pm »
I think they're saying GCC (the compiler) doesn't care what you name things, i.e. namespace is only whatever's in present/local context.

The error is when you release such a project to the broader world; code doesn't exist to make something once and never touch it again.  Well, sometimes it does, and you can do whatever you want with that.  But as soon as someone else has to look at it, they will read it from their own perspective, subject to all their cultural biases.  Until they've read the entire codebase from top to bottom, they're just going to assume std_func() is the standard function.  A lot of contradictory ideas are thus formed, broken, reformed, and broken again, until one is fully familiar with the codebase.

The result is even worse for shared or open projects, where libraries need to be pulled from diverse sources and integrated into a given project; and projects into others and so on.  Suppose someone wants to use multiple libraries for similar but slightly different purposes; it would be quite rude if your library doesn't quite duplicate the others' function, yet has similar or identical naming; or if someone goes and reuses part of your code in another project, and doesn't know the nuances between these variants, or so on...

And probably a million other things I don't know, as I don't work in FOSS.  Even just a microcosm of that, I saw when trying to compile AVRDude on Cygwin. Loaded up, what I thought was all the dependencies I needed.  Cmake didn't make.  Some file conflicts.  Poked around some more.  Oh maybe the paths are wrong, because Cygwin is packaged slightly different or something.  Improved, but still hung on some libusb thingy.  But it's tagged correctly, seems to be the version it's looking for and everything...  Eventually after some digging, I found it was referencing some usb.h and other stuff from a different project.  Opened a ticket, and the official devs didn't have much for ideas either.  Granted, compiling on Cygwin is a veeery low priority for them.  I was a bit surprised as the wiki listed Cygwin as a platform (and maybe still does..?), but obviously it's not in their tests.  I did end up compiling it but I had to rename the offending file.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #158 on: June 08, 2023, 09:33:55 pm »
Both GCC and Clang generate quite good code for
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
across a large number of architectures.
Speed increases, size decreases - what am I doing wrong?
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #159 on: June 08, 2023, 09:38:42 pm »
they're just going to assume std_func() is the standard function
Very true.

A secondary issue is whether the compiler makes the same assumptions in a freestanding environment that it is allowed to make in a hosted environment (like replacing printf("Hello\n") with puts("Hello")).  To be safe, the names and name patterns used by the standard C library for current (and future reserved) functionality have to be avoided.  Thus, it is not just humans, but also "murky" implementation details of C compilers in freestanding environments.

I prefer to do this by prefixing my function names with a (library name) prefix and an underscore, but there are other ways as well, CamelCase being one.
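
A sketch of the prefix convention, using the thread's own topic as the example.  "mylib" is a placeholder library name and mylib_u32_to_dec is a made-up function; the point is only that neither a human reader nor the compiler can mistake it for a standard library function.

```c
#include <stdint.h>

/* Writes value in decimal, backwards, ending at tail (which receives
   the NUL).  The buffer needs room for 10 digits + NUL = 11 bytes.
   Returns a pointer to the first digit. */
char *mylib_u32_to_dec(char *tail, uint32_t value)
{
    *tail = '\0';
    do {
        *--tail = (char)('0' + value % 10u);
        value /= 10u;
    } while (value != 0);

    return tail;
}
```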
 

Offline MMMarco

  • Regular Contributor
  • *
  • Posts: 69
  • Country: ch
  • Hobbyist. ⚠️ Opinionated
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #160 on: June 08, 2023, 09:40:02 pm »
Both GCC and Clang generate quite good code for
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
across a large number of architectures.
Speed increases, size decreases - what am I doing wrong?

Code needs to be clear in order to be maintainable.

Code is there for other people to read¹, not for the "system" "specifically", so easier to understand/read is always preferred.

It's basically communication between the author of the code, the author itself (future you) and other people using the code.

Code is a tool invented by people, for people.

I couldn't make real sense out of your code until you (somewhat) properly annotated it with comments.

The best code is the code that can be read and understood by the least knowledgeable person (in my opinion) [and has reasonable performance, of course].

That doesn't mean we should compromise performance for readability, but as with all things, it's a balancing act.

¹ because after compilation, the machine doesn't care how you structure/put together your code ; if you're coding in assembly, it's a different story.
« Last Edit: June 08, 2023, 09:46:50 pm by MMMarco »

 
The following users thanked this post: nctnico

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #161 on: June 08, 2023, 09:43:59 pm »
Both GCC and Clang generate quite good code for
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
across a large number of architectures.
Speed increases, size decreases - what am I doing wrong?
What compiler and options are you using, and what architecture are you targeting?

I'm using several compilers and target architectures at Compiler Explorer (godbolt.org) to examine the machine code.
 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #162 on: June 08, 2023, 09:48:57 pm »
Code needs to be clear to be maintainable.
Ok, there are requirements for the function algorithm, there is my version, there is a detailed description of the magic.
Everything can be changed.
 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #163 on: June 08, 2023, 10:00:10 pm »
What compiler and options are you using, and what architecture are you targeting?
Only modern, current architectures: Cortex and RISC-V. Any compiler settings except "-O0".
Practical tests on Cortex-M0, Cortex-M7, and RISC-V CH32Vxxx.
x86 - only to check the integrity of the algorithm.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 825
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #164 on: June 09, 2023, 02:20:36 am »
Quote
If GCC c++ were to grow clang-like support for named address spaces, then I would actually prefer a subset of c++ in AVR-class microcontrollers.
Then use the newer avr0/1 and above series, which have code space mapped to data space, so there is no need to deal with const data in a special way except to specify const. These newer series are also better in about every way.

A simple example for a tiny416, showing a basic iostream-like formatter and some other things:
https://godbolt.org/z/nsh41oc9r

The Print class is limited for the example (and therefore not quite correct), but at the heart of it are a few basic functions that do the work. The rest of it is manipulating the format options and directing values to their final destination. Apart from the hardware-related things like uart/rtc in the example, the same code can be tested on x86 and moved to any other mcu which has c++ support.

The avr0/1 series grew into other series, and of course they ran out of address space to map code space. For the series that can exceed the address space, they map only a 32k code segment at a time. Like all things 8 bit, eventually you have to deal with its limitations in some form.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4028
  • Country: nz
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #165 on: June 09, 2023, 02:31:45 am »
Both GCC and Clang generate quite good code for
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
across a large number of architectures.
Speed increases, size decreases - what am I doing wrong?
What compiler and options are you using, and what architecture are you targeting?

I'm using several compilers and target architectures at Compiler Explorer (godbolt.org) to examine the machine code.

Seems to me the following should often be shorter/faster code if translated literally -- except when compilers are too smart for their own good

Code: [Select]
hex_char = digit + '0';
hex_char += (hex_char <= '9') ? 0 : 'a' - '0' - 10;

Or with an "if" for the second line.
 
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #166 on: June 09, 2023, 05:17:47 am »
Quote
If GCC c++ were to grow clang-like support for named address spaces, then I would actually prefer a subset of c++ in AVR-class microcontrollers.
Then use the newer avr0/1 and above series
No, I like ATmega32U4 and AT90USB1286 with their native USB interfaces.

Besides, isn't that suggestion like telling someone to dig somewhere else where the ground is softer, when they mention their shovel is blunt?

The entire unified address space approach is a kludge, papering over a fundamental architecture detail, just because some compilers find it difficult to deal with.  Rather silly to me.

Seems to me the following should often be shorter/faster code if translated literally -- except when compilers are too smart for their own good
Code: [Select]
hex_char = digit + '0';
hex_char += (hex_char <= '9') ? 0 : 'a' - '0' - 10;
Why would you think so?  Because it's the best one on current clang trunk on rv32gc with -O2 and -Os optimization levels?

I base mine on the observations (Compiler Explorer test set 1, test set 2) across a few architectures on GCC and Clang trunks.  The single conditional form is not always optimal, but it seems to always generate something reasonable (for that particular compiler).

From such, it seems to me that the conditional expressions,
    hex_char = digit + ((digit < 10) ? '0' : 'a' - 10);
and
    hex_char = (digit < 10) ? digit + '0' : digit + 'a' - 10;
tend to be best recognized by the compiler wrt. optimization, compared to more spread out expressions.

Put simply, I want to make sure compilers don't generate stupid code.  I don't care how close they get to the optimal, as long as it is not stupidly far from it.
« Last Edit: June 09, 2023, 05:19:35 am by Nominal Animal »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14440
  • Country: fr
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #167 on: June 09, 2023, 05:34:22 am »
It really depends on the compiler, optimization level and target.

From what I've (quickly) tried (several compilers, several targets), Bruce's version was either very close to yours or worse. I haven't seen it compile to "better code" in any of the cases I've tried (but I haven't tried them all, obviously).

Interestingly, the following one, while not a one-liner, is actually more readable (even if maybe only marginally so to some), and compiles to code that is either very similar to your version's or slightly better, notably on ARM targets, with one fewer instruction.

Code: [Select]
char DigitToHex(unsigned char digit)
{
    if (digit < 10)
        return '0' + digit;
   
    return 'a' + digit - 10;
}

(Put it in function form, as it's the easiest to compare when compiling out of context.)
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4028
  • Country: nz
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #168 on: June 09, 2023, 05:41:31 am »
Seems to me the following should often be shorter/faster code if translated literally -- except when compilers are too smart for their own good
Code: [Select]
hex_char = digit + '0';
hex_char += (hex_char <= '9') ? 0 : 'a' - '0' - 10;
Why would you think so?  Because it's the best one on current clang trunk on rv32gc with -O2 and -Os optimization levels?

Nope, I didn't even look at what current compilers happen to generate -- it's just from first principles of having the fewest number of live variables, purely a structural property of the source code.

You can do the if-based version even on a single-accumulator machine without saving A off to X/Y/stack/RAM and restoring it later.

Code: [Select]
    lda digit
    clc
    adc #'0'
    cmp #'9'+1
    bcc done            ; carry clear: '0'..'9', nothing more to do
    adc #'A'-'0'-10-1   ; carry is set here, so ADC adds one extra
done:

 

Offline JPortici

  • Super Contributor
  • ***
  • Posts: 3461
  • Country: it
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #169 on: June 09, 2023, 08:28:42 am »
It is optimization at the wrong level: if you want specific instructions, use extended assembly; otherwise, enable optimizations and let the compiler deal with it.  Doing it your way, you'll have to pore through your 'optimizations' for every single compiler and compiler version and instruction set architecture separately, and edit to get the desired output, as otherwise you have very unoptimal implementations.

 :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap: :clap:

Like I've been saying for years. Write Clear, Understandable C, let the compiler do its job.
Or you'll have to use the imperium programming language because it turns out that C is not a good language for microcontrollers

The whole premise for this thread has been amusing
 
The following users thanked this post: newbrain, MMMarco

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6236
  • Country: fi
    • My home page and email address
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #170 on: June 09, 2023, 03:26:22 pm »
Seems to me the following should often be shorter/faster code if translated literally -- except when compilers are too smart for their own good
Code: [Select]
hex_char = digit + '0';
hex_char += (hex_char <= '9') ? 0 : 'a' - '0' - 10;
Why would you think so?  Because it's the best one on current clang trunk on rv32gc with -O2 and -Os optimization levels?
Nope, I didn't even look at what current compilers happen to generate -- it's just from first principles of having the fewest number of live variables, purely a structural property of the source code.
I don't see why that principle would apply to such simple statements – but then again, I don't have a background in compiler and instruction set design; I know you do.

There is some logic behind my practical experimentation, though: to keep the expressions in a form that gives the compiler the most opportunities for optimization (or rather, the most choice in how to implement the expressions in machine code).  If we consider the C standard abstract machine model, not splitting statements that have no observable side effects gives the compiler more opportunities for optimization.  Current C compilers use some form of an abstract syntax tree, and apply optimizations that do not move observable side effects directly to the abstract syntax tree, with typical widely used patterns being the most likely to be optimized effectively.  This leads to an expectation that, for example, result=(c1)?t1:(c2)?t2:(c3)?t3:f; can be easier for the compiler to optimize than the corresponding if (c1) result=t1; else if (c2) result=t2; else if (c3) result=t3; else result=f; because the conditional operator expression form has only a single assignment.

It seems to me that with optimizations enabled (-Os or -O2 or equivalent), that tends to apply.  That is, compilers good at optimizing will optimize both to the same machine code for a given architecture; but compilers that don't do optimization too well for a given architecture (Clang for example tends to generate a lot of code at higher optimization levels) tend to deal better with the conditional operator form compared to the if-else-chain.
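
To try this yourself, the two shapes can be written as a pair of functions and pasted into Compiler Explorer side by side.  This is only a sketch with made-up names and an arbitrary three-way classification; it shows the source forms being compared, not any particular compiler's output.

```c
/* Conditional operator chain: a single expression, a single result. */
int bucket_cond(int x)
{
    return (x < 0) ? -1 : (x == 0) ? 0 : (x < 10) ? 1 : 2;
}

/* The corresponding if-else chain: four separate assignments. */
int bucket_if(int x)
{
    int result;

    if (x < 0)
        result = -1;
    else if (x == 0)
        result = 0;
    else if (x < 10)
        result = 1;
    else
        result = 2;

    return result;
}
```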

It is somewhat similar to how these compilers on some architectures generate better code when using pointers, and others when using array indexing.
There are architectures like x86-64 that have (limited) native base+offset data addressing, and architectures that do not.
Array indexing code can be optimized for both, although it is harder to optimize for the no-base+offset data addressing architectures.
The pointer-using code doing the same task is very difficult to transform or optimize into base+offset data addressing form.
Examining these compilers and various cases like strstr()-type "find needle in haystack", I've found that when you have only one varying offset to an array within a loop, the array form tends to generate better code; but when you have two or more different offsets, the pointer form tends to generate better code.

(And by "better code", I don't mean "always the optimal form", I do mean the least silly code on any architecture I am interested in.)
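
For reference, these are the two source forms I mean, in the simplest one-offset case.  The function names are made up for this sketch, and which one compiles to better code is exactly the thing to check per compiler and architecture.

```c
#include <stddef.h>

/* Array-index form: one varying offset (i) into buf. */
size_t count_byte_indexed(const unsigned char *buf, size_t len, unsigned char c)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++)
        if (buf[i] == c)
            count++;

    return count;
}

/* Pointer form: the same loop, walking a pointer to the end. */
size_t count_byte_pointer(const unsigned char *buf, size_t len, unsigned char c)
{
    const unsigned char *const end = buf + len;
    size_t count = 0;

    while (buf < end)
        if (*buf++ == c)
            count++;

    return count;
}
```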

If you have different experience and/or reasoning, I for one would appreciate hearing about it.

You can do the if-based version even on a single-accumulator machine without saving A off to X/Y/stack/RAM and restoring it later.
Why wouldn't the compiler generate the same for the conditional operator expression?  There are no sequence points or side effects that require the compiler to do two separate operations for a single conditional operator expression; it can freely simplify the logic.

I don't think writing C as if it were a macro assembler makes sense today, when even sdcc can optimize the code it generates.  In the cases where one uses a non-optimizing C compiler, I don't think the small differences in execution time matter anyway.
 

Offline profdc9

  • Frequent Contributor
  • **
  • Posts: 319
  • Country: us
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #171 on: June 09, 2023, 05:05:00 pm »
I have some alternatives I typically use for libc functions that use relatively few resources, some of which I've written or modified.

https://github.com/profdc9/VNA/blob/master/VNA/mini-printf.c
https://github.com/profdc9/VNA/blob/master/VNA/mini-printf.h

This is a small printf implementation that I have modified to include conditionally compilable floating point support.  It also has separate built-in itoa/ftoa-like functions.

https://github.com/profdc9/VNA/tree/master/VNA/tinycl.cpp
https://github.com/profdc9/VNA/tree/master/VNA/tinycl.h

This is a very tiny command line parser that I wrote and use for embedded projects that typically uses less than 1k of flash and < 100 bytes of RAM (enough for one command line).

https://github.com/profdc9/VNA/blob/master/VNA/consoleio.cpp
https://github.com/profdc9/VNA/blob/master/VNA/consoleio.h

Some helper functions to adapt to various environments (for example, adapting output to Arduino or whatnot).

https://github.com/profdc9/VNA/tree/master/VNA/structconf.cpp
https://github.com/profdc9/VNA/tree/master/VNA/structconf.h

Allows easy setting of a configuration structure using text strings pointing to the fields in the structure. 

All this stuff (except for the mini-printf which looks like MIT/BSD) is under the zlib license which is very permissive for embedded use.




 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #172 on: June 12, 2023, 01:13:32 am »
The fun fact of adding new on top of old legacy.
For cortex and risc-v multiplication "u64=u32*u32" is faster than "u32=u32*u32".
I checked!!!
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14440
  • Country: fr
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #173 on: June 12, 2023, 01:18:01 am »
Sure.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4028
  • Country: nz
Re: stdlib + MCU haters: std itoa() uses less resources!
« Reply #174 on: June 12, 2023, 02:46:27 am »
The fun fact of adding new on top of old legacy.
For cortex and risc-v multiplication "u64=u32*u32" is faster than "u32=u32*u32".
I checked!!!

I don't know how you could establish such a thing, given that both "cortex" and "risc-v" cover many different cores from the likes of the 32 bit Cortex-M0 to big out-of-order 64 bit cores (and pretty big OoO 32 bit A15 too). In the case of RISC-V, many smaller cores don't even have a multiply instruction e.g. CH32V003, and even if they have a multiply instruction its execution speed is arbitrary and could be at least any of 1, 4, 8, or 32 cycles or other values.

Cortex-M0 has hardware 32x32->32 multiply (that usually takes 1 cycle), but doesn't have any instruction that directly calculates the upper 32 bits, so a 64 bit result will necessarily take much longer than a 32 bit one.

On 32 bit RISC-V, assuming you have the M extension, there is an instruction for the bottom 32 bits of the result, and separate instructions for the top 32 bits (depending on whether signed, unsigned, or a mix). A 64 bit result will almost certainly take twice as long as a 32 bit one.  On 64 bit RISC-V there are instructions for both 32x32->32 and 64x64->64 multiply.

I'm really struggling to see any circumstance in which a 32x32->64 multiply could be faster than a 32x32->32 one.
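
For anyone who wants to check this on a specific core, these are the three operations in question as plain C.  The function names are made up for this sketch; compiling them for a given target on Compiler Explorer shows which instructions each one actually needs there.

```c
#include <stdint.h>

/* 32x32->32: only the low half of the product. */
uint32_t mul_32x32_32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* 32x32->64: the full product; the cast makes the multiply 64-bit. */
uint64_t mul_32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Only the high half of the full product. */
uint32_t mul_32x32_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}
```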
« Last Edit: June 12, 2023, 09:01:23 am by brucehoult »
 

