stdlib + MCU haters: std itoa() uses less resources!

westfw:


--- Quote from: brucehoult on Yesterday at 07:46:27 pm ---

Cortex-M0 has hardware 32x32->64 multiply (that usually takes 1 cycle), but doesn't have any instruction that directly calculates the upper 32 bits, so a 64 bit result will necessarily take much longer than a 32 bit one.

--- End quote ---


I don't understand what that sentence is supposed to mean.
CM0 only has 32*32=32bit multiply.
From the v6m ARM ARM:


--- Quote ---The only multiply instruction supported in ARMv6-M performs a 32x32 multiply that generates a 32-bit result, see MUL on page A6-159. The instruction can operate on signed or unsigned quantities.
--- End quote ---

brucehoult:

--- Quote from: westfw on June 12, 2023, 08:15:40 am ---

--- Quote from: brucehoult on Yesterday at 07:46:27 pm ---

Cortex-M0 has hardware 32x32->64 multiply (that usually takes 1 cycle), but doesn't have any instruction that directly calculates the upper 32 bits, so a 64 bit result will necessarily take much longer than a 32 bit one.

--- End quote ---


I don't understand what that sentence is supposed to mean.
CM0 only has 32*32=32bit multiply.

--- End quote ---

Sorry, typo. As the rest of the sentence "doesn't have any instruction that directly calculates the upper 32 bits" makes clear, I meant 32x32->32.
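
To make the cost difference concrete, here is a minimal sketch of how a full 64-bit product can be assembled when only a 32x32->32 multiply is available. The function name umul32x32_64 is made up for this illustration, and this is not claimed to be what any particular compiler or runtime library emits; it is just the schoolbook method on 16-bit halves.

```c
#include <stdint.h>

/* Hypothetical helper: 32x32->64 unsigned multiply built only from
 * 32x32->32 multiplies, by splitting each operand into 16-bit halves. */
uint64_t umul32x32_64(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four partial products; each is at most 0xFFFE0001, so none overflows. */
    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    /* Middle column (bits 16..47): the sum is at most 0x2FFFC, so it fits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    uint32_t lo = (ll & 0xFFFFu) | (mid << 16);               /* bits 0..31  */
    uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16); /* bits 32..63 */

    return ((uint64_t)hi << 32) | lo;
}
```

That is four multiplies plus a handful of shifts, masks and adds for the 64-bit result, against a single multiply for the 32-bit one.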

SiliconWizard:

--- Quote from: brucehoult on June 12, 2023, 02:46:27 am ---I'm really struggling to see any circumstance in which a 32x32->64 multiply could be faster than a 32x32->32 one.

--- End quote ---

There isn't. At least not on a 32-bit processor, that just wouldn't make any sense.
Possibly on some odd 64-bit processor (that I do not know) that could be possible, say if they favored the 32x32->64 multiply and the ISA was such that getting the low 32-bit part would require an additional instruction.

Of course, with such an assertion above, we may be all the way back to the original post, a questionable analysis of a very particular case in a particular context with a particular compiler.

Nominal Animal:

--- Quote from: SiliconWizard on June 12, 2023, 07:43:18 pm ---a questionable analysis of a very particular case in a particular context with a particular compiler.
--- End quote ---
Yup.  Microbenchmarking becomes more difficult the shorter the target code is and the fewer clock cycles it takes, especially on architectures with caches, multiple execution units, or multiple arithmetic-logic units: the surrounding code, including the elapsed time measurement, affects the target code too much for the measurements to be useful.  At some point, it becomes nonsensical, because the measurement uncertainty and biases exceed the target duration.

Much better (but still microbenchmarking) is to implement a function in more than one way, run each implementation in a loop over precomputed inputs, and measure the time taken to handle all inputs.  This is then repeated a few thousand times, and the durations recorded.  The most useful measure is not the average (because the timing error under normal operating systems is always positive: the task may be interrupted by other stuff) but the median (or some other percentile).  The minimum is only academically interesting, in the sense that it is the time taken "when all stars happen to align"; an unrealistic minimum that may occur, but cannot be relied upon to occur.  The median is an easy one to explain: in half the cases, the time taken was at most the median.
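
A minimal sketch of that loop-and-median approach in C follows. All names here (bench_fn, bench_median, count_digits_all) are made up for illustration, and it uses the standard clock(), which reports CPU time at fairly coarse resolution; on a microcontroller you would substitute a hardware timer or cycle counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* A candidate processes all precomputed inputs and returns a checksum,
 * so the compiler cannot discard the work. */
typedef unsigned (*bench_fn)(const unsigned *inputs, size_t count);

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts the array in place. */
double median(double *samples, size_t n)
{
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_double);
    return (n % 2) ? samples[n / 2]
                   : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

/* Times `runs` passes of fn over the whole input set, stores each duration
 * (in seconds) in samples[], and returns the median duration of one pass. */
double bench_median(bench_fn fn, const unsigned *inputs, size_t count,
                    double *samples, size_t runs)
{
    volatile unsigned sink = 0;  /* keeps the result observably used */
    for (size_t r = 0; r < runs; r++) {
        clock_t t0 = clock();
        sink += fn(inputs, count);
        clock_t t1 = clock();
        samples[r] = (double)(t1 - t0) / CLOCKS_PER_SEC;
    }
    (void)sink;
    return median(samples, runs);
}

/* Hypothetical candidate: total number of decimal digits over all inputs. */
unsigned count_digits_all(const unsigned *inputs, size_t count)
{
    unsigned total = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned v = inputs[i];
        do { total++; v /= 10; } while (v != 0);
    }
    return total;
}
```

You would call bench_median() once per implementation with the same inputs and a few thousand runs, then compare the medians. The input set has to be large enough that one pass takes much longer than the timer resolution, otherwise most samples come out as zero.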

Proper benchmarking involves taking a real world task and data set, and processing it using different implementations.  Then, of course, you don't benchmark a single operation, but the implementations.

Premature optimization, like trying to make the fastest atoi() you can before making sure it is a limiting bottleneck in your task at hand, is an extremely common mistake, especially among programmers without sufficiently wide experience: they spend a lot of time on "optimizing" something that has no effect on the end result, essentially wasting valuable time.  Most often, true optimization avoids having to do that thing altogether, and achieves an order of magnitude greater savings.

An example of that is reading lots of unsorted data from storage, when you need it in order, with a human waiting for the operation to complete.  You can discuss sort algorithms as much as you want, but instead of reading all the data and then sorting it, you can get the task done faster (using slightly more computing resources) by sorting the data as it becomes available, for example by reading each data entry into a binary heap or a (balanced) tree.  This is less important now with extremely fast SSDs, but with e.g. SD cards (often used for removable storage on microcontrollers and embedded devices, even on phones) and other storage media with limited transfer rates, the insertion (online sorting of the new entry into the data structure) takes place during time which would otherwise be wasted waiting for new data to arrive.  The end result on these slower media is that even though you end up using more CPU cycles, the data is sorted and ready basically as soon as all of it has been read, whereas the fastest offline sorting algorithm is only just starting its work at that point.
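
As a rough illustration of the heap variant, here is a small fixed-capacity binary min-heap in C. The names (min_heap, heap_push, heap_pop, HEAP_CAPACITY) and the choice of uint32_t keys are assumptions made for this sketch; real records would carry a key plus payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAPACITY 64  /* hypothetical fixed size, chosen for the sketch */

typedef struct {
    uint32_t items[HEAP_CAPACITY];
    size_t   count;
} min_heap;

/* Insert one key as it arrives from storage.
 * Returns 0 on success, -1 if the heap is full. */
int heap_push(min_heap *h, uint32_t key)
{
    if (h->count >= HEAP_CAPACITY)
        return -1;
    size_t i = h->count++;
    while (i > 0) {                      /* sift up: move larger parents down */
        size_t parent = (i - 1) / 2;
        if (h->items[parent] <= key)
            break;
        h->items[i] = h->items[parent];
        i = parent;
    }
    h->items[i] = key;
    return 0;
}

/* Remove and return the smallest key. The heap must not be empty. */
uint32_t heap_pop(min_heap *h)
{
    assert(h->count > 0);
    uint32_t top  = h->items[0];
    uint32_t last = h->items[--h->count];
    size_t i = 0;
    for (;;) {                           /* sift down: pull smaller child up */
        size_t child = 2 * i + 1;
        if (child >= h->count)
            break;
        if (child + 1 < h->count && h->items[child + 1] < h->items[child])
            child++;
        if (last <= h->items[child])
            break;
        h->items[i] = h->items[child];
        i = child;
    }
    h->items[i] = last;
    return top;
}
```

The reader loop calls heap_push() for each entry while the next block is still in transit, and the consumer then calls heap_pop() repeatedly to get entries in ascending order. Note that with a heap each pop still costs O(log n); a balanced tree can instead be walked in order directly once the last entry is in.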

SiliconWizard:

--- Quote from: Nominal Animal on June 12, 2023, 08:49:33 pm ---Premature optimization, like trying to make the fastest atoi() you can before making sure it is a limiting bottleneck in your task at hand, is an extremely common mistake, especially among programmers without sufficiently wide experience: they spend a lot of time on "optimizing" something that has no effect on the end result, essentially wasting valuable time.  Most often, true optimization avoids having to do that thing altogether, and achieves an order of magnitude greater savings.

--- End quote ---

Yes, yes, and yes!

It's interesting to see many people navigating between these two extremes: either excessive optimization of unimportant stuff, or a complete waste of resources to save a couple of hours, or sometimes actually just minutes, of development time.
