Author Topic: How are math.h library functions like sqrt, asinf, tanf implemented in CoIDE (Read 17814 times)

jpelczar · « **Reply #25 on:** January 02, 2015, 09:20:44 am »

You may also look at FreeBSD math library: https://svnweb.freebsd.org/base/head/lib/msun/src/

https://svnweb.freebsd.org/base/head/lib/msun/src/s_sin.c?revision=218509&view=markup
https://svnweb.freebsd.org/base/head/lib/msun/src/k_sin.c?revision=218509&view=markup
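
For anyone who doesn't want to open the links: the two files split the work into argument reduction (s_sin.c) and a polynomial kernel on a small interval (k_sin.c, with k_cos.c as its cosine counterpart). Below is a minimal sketch of that structure only. It is not the msun code: the names `sketch_sin`, `kernel_sin` and `kernel_cos` are made up, the coefficients are plain Taylor terms rather than the library's tuned ones, and the reduction is a naive subtraction that loses accuracy for large arguments.

```c
/* Sketch only: reduce x to r in about [-pi/4, pi/4], then pick a kernel. */
static const double HALF_PI = 1.5707963267948966;

/* Taylor polynomial for sin(r), adequate only for small |r|. */
static double kernel_sin(double r)
{
    double r2 = r * r;
    return r * (1.0 + r2 * (-1.0 / 6 + r2 * (1.0 / 120
             + r2 * (-1.0 / 5040 + r2 * (1.0 / 362880)))));
}

/* Taylor polynomial for cos(r), adequate only for small |r|. */
static double kernel_cos(double r)
{
    double r2 = r * r;
    return 1.0 + r2 * (-1.0 / 2 + r2 * (1.0 / 24
             + r2 * (-1.0 / 720 + r2 * (1.0 / 40320))));
}

double sketch_sin(double x)
{
    /* n = nearest integer to x / (pi/2); r = remainder after removing n*(pi/2). */
    double q = x / HALF_PI;
    long n = (long)(q + (q >= 0.0 ? 0.5 : -0.5));
    double r = x - (double)n * HALF_PI;

    /* sin(r + n*pi/2) cycles through sin, cos, -sin, -cos. */
    switch (((n % 4) + 4) % 4) {
    case 0:  return  kernel_sin(r);
    case 1:  return  kernel_cos(r);
    case 2:  return -kernel_sin(r);
    default: return -kernel_cos(r);
    }
}
```

With these Taylor terms the result is only good to roughly 1e-7 for moderate inputs; the real library gets full double precision from better coefficients and a much more careful reduction.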

westfw · « **Reply #26 on:** January 02, 2015, 10:00:47 am »

Quote

FreeBSD math library

It looks to me like the newlib nano libraries and the freebsd libraries are essentially the same old Sun code.

kody · « **Reply #27 on:** January 03, 2015, 02:01:20 am »

Thanks for the links @jpelczar. Do you think the implementation would be the same across different compilers/vendors?

Quote from: jpelczar on January 02, 2015, 09:20:44 am

You may also look at FreeBSD math library: https://svnweb.freebsd.org/base/head/lib/msun/src/

https://svnweb.freebsd.org/base/head/lib/msun/src/s_sin.c?revision=218509&view=markup
https://svnweb.freebsd.org/base/head/lib/msun/src/k_sin.c?revision=218509&view=markup

kody · « **Reply #28 on:** January 03, 2015, 02:03:08 am »

@dannyf- could you please expand more on how the delisting and the counting of the instruction cycles can be done?

Quote from: dannyf on December 31, 2014, 02:11:45 pm

Quote
is it not possible if we only have the compiler?

Yes. By looking at the delisting and counting up the instruction cycles.

Quote
I mean, without the actual hardware/microcontroller.

A far better approach is to get the actual hardware.

tggzzz · « **Reply #29 on:** January 03, 2015, 09:25:55 am »

Quote from: kody on January 03, 2015, 02:03:08 am

@dannyf- could you please expand more on how the delisting and the counting of the instruction cycles can be done?

Especially if the processor has a cache

andersm · « **Reply #30 on:** January 03, 2015, 11:03:11 am »

There are several options:
- Use a high-frequency counter. The simplest, and works on any chip. On many MCUs you can have timers ticking at the CPU frequency or f/2.
- Use tracing. Both the MCU and your development tools have to support this.
- Use performance counters. Many MCUs have these nowadays, and they can count instruction cycles, memory access cycles and all kinds of things.

The ARMv7-M DWT (Data Watchpoint and Trace unit) has both a cycle count timer and performance counters that can be used by software running on the MCU. Although it is optional, I would expect that all Cortex-M4 MCUs have it.
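
As a rough illustration of the cycle-counter option, here is a sketch that enables the DWT cycle counter and times one call. The register addresses and bit positions are the architectural ARMv7-M ones; the function names (`cycle_counter_start`, `cycles_since`, `measure_cycles`) are invented for this example, and the host-side fallback exists only so the code compiles off-target. Check your part's reference manual, since the DWT is optional and some devices may need extra steps before the counter runs.

```c
#include <stdint.h>

/* GCC/Clang predefine these macros for ARMv7-M / ARMv7E-M targets. */
#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
#define DEMCR      (*(volatile uint32_t *)0xE000EDFCu) /* Debug Exception and Monitor Control */
#define DWT_CTRL   (*(volatile uint32_t *)0xE0001000u) /* DWT control register */
#define DWT_CYCCNT (*(volatile uint32_t *)0xE0001004u) /* DWT cycle counter */
#else
/* Host build (assumption for illustration): plain variables stand in for
   the registers, so the counter never advances here. */
static volatile uint32_t DEMCR, DWT_CTRL, DWT_CYCCNT;
#endif

#define DEMCR_TRCENA       (1u << 24) /* enables the DWT block */
#define DWT_CTRL_CYCCNTENA (1u << 0)  /* enables the cycle counter */

/* Turn the counter on and reset it to zero. */
void cycle_counter_start(void)
{
    DEMCR |= DEMCR_TRCENA;
    DWT_CYCCNT = 0;
    DWT_CTRL |= DWT_CTRL_CYCCNTENA;
}

/* Cycles elapsed since 'start'; unsigned subtraction survives one wrap. */
uint32_t cycles_since(uint32_t start)
{
    return DWT_CYCCNT - start;
}

/* Time a single call of fn(arg). The result goes through a volatile
   pointer so the compiler cannot drop the call. */
uint32_t measure_cycles(float (*fn)(float), float arg, volatile float *result)
{
    uint32_t start = DWT_CYCCNT;
    *result = fn(arg);
    return cycles_since(start);
}
```

On the target you would call `cycle_counter_start()` once and then something like `measure_cycles(tanf, 0.5f, &out)`. The number includes the call overhead and the two counter reads, so subtract the result of timing an empty function if you need a tighter figure.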

0xdeadbeef · « **Reply #31 on:** January 03, 2015, 11:20:36 am »

Quote from: tggzzz on January 03, 2015, 09:25:55 am

Especially if the processor has a cache

Obviously, a real measurement will always differ due to interrupts, pipeline effects, branch prediction, cache, waitstates, DMA blocking the bus or RAM etc.
Anyway, looking at the code makes it possible to better estimate the number of cycles needed. As stated above, even looking at C code for a "fast" tangent implementation lets you say it will need > 100 cycles. With the actual source code the prediction will be better, and with the ASM code it can be quite accurate - if you don't consider the complex runtime effects discussed above.
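
The paper estimate amounts to counting instructions of each kind in the listing and multiplying by their cycle costs. A tiny sketch of that arithmetic follows. Every number in `example_ops` is a placeholder chosen for illustration, not taken from any real listing or core manual, and the names `op_cost` and `estimate_cycles` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One row per instruction class found in the listing. */
struct op_cost {
    const char *name;
    unsigned count;        /* how many times it appears on the path */
    unsigned cycles_each;  /* assumed cost, to be taken from the core's manual */
};

/* Sum of count * cost: ignores cache, wait states, interrupts, DMA. */
unsigned estimate_cycles(const struct op_cost *ops, size_t n)
{
    unsigned total = 0;
    assert(ops != NULL);
    for (size_t i = 0; i < n; i++)
        total += ops[i].count * ops[i].cycles_each;
    return total;
}

/* Placeholder figures only. */
static const struct op_cost example_ops[] = {
    { "fp multiply",  10, 1  },
    { "fp add",        8, 1  },
    { "fp divide",     1, 14 },
    { "load/store",    6, 2  },
    { "taken branch",  3, 3  },
};
```

For the placeholder table this gives 10 + 8 + 14 + 12 + 9 = 53 cycles. Treat a figure like that as a floor for straight-line execution, since none of the runtime effects listed above are in it.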

tggzzz · « **Reply #32 on:** January 03, 2015, 04:50:57 pm »

Quote from: 0xdeadbeef on January 03, 2015, 11:20:36 am

Quote from: tggzzz on January 03, 2015, 09:25:55 am

Especially if the processor has a cache

Obviously, a real measurement will always differ due to interrupts, pipeline effects, branch prediction, cache, waitstates, DMA blocking the bus or RAM etc.
Anyway, looking at the code makes it possible to better estimate the number of cycles needed. As stated above, even looking at C code for a "fast" tangent implementation lets you say it will need > 100 cycles. With the actual source code the prediction will be better, and with the ASM code it can be quite accurate - if you don't consider the complex runtime effects discussed above.

Even 20 years ago, on an i486 with its tiny cache and doing nothing else, there was a measured 10:1 difference between mean and maximum times. Modern processors have much faster clocks, but DRAM latency hasn't changed. Processors have much bigger caches and depend on them more to reduce the average memory latency. Naturally, caches cannot change the maximum latency.

Hence the maximum:mean ratio has increased significantly and predicted execution times are even less valid than before.

Remember the truism "cache is the new RAM, RAM is the new disk"
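
To see the maximum:mean gap on your own system, record many cycle counts for the same call and summarise them, rather than trusting a single run. A minimal sketch, assuming the samples are already in an array; `timing_stats`, `summarise` and the sample values are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct timing_stats {
    uint32_t min;
    uint32_t max;
    double   mean;
};

/* Summarise n cycle-count samples; n must be at least 1. */
struct timing_stats summarise(const uint32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    struct timing_stats s = { samples[0], samples[0], 0.0 };
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) s.max = samples[i];
        total += samples[i];
    }
    s.mean = (double)total / (double)n;
    return s;
}

/* Made-up samples: mostly fast runs plus one slow outlier. */
static const uint32_t example_samples[] = { 100, 102, 98, 100, 1000 };
```

For the made-up samples the mean is 280 and the maximum is 1000. For hard realtime work it is the maximum that matters, and no finite set of measurements proves you have seen the worst case.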

0xdeadbeef · « **Reply #33 on:** January 03, 2015, 04:57:27 pm »

The situation is a little different on microcontrollers. Some still don't have any cache at all, some have very simple implementations, only the high end controllers have complex ones.
Internal SRAM is usually not cached. Cache is mainly needed to improve performance when running from flash. Note that fetching instructions from flash is a bottleneck for most faster microcontrollers. They usually use a burst read to fill a whole cache line, but if there are a lot of branches and/or bad branch prediction, a cache miss can be a big performance hit.

mikerj · « **Reply #34 on:** January 03, 2015, 05:00:02 pm »

Quote from: kody on December 30, 2014, 11:46:43 pm

@dannyf - is it not possible if we only have the compiler? I mean, without the actual hardware/microcontroller.

You could use a simulator if one is available, but I don't think CoIDE includes this functionality. You could use the demo version of Keil etc. if your code fits within its size limits.

tggzzz · « **Reply #35 on:** January 03, 2015, 05:56:03 pm »

Quote from: 0xdeadbeef on January 03, 2015, 04:57:27 pm

The situation is a little different on microcontrollers. Some still don't have any cache at all, some have very simple implementations, only the high end controllers have complex ones.
Internal SRAM is usually not cached. Cache is mainly needed to improve performance when running from flash. Note that fetching instructions from flash is a bottleneck for most faster microcontrollers. They usually use a burst read to fill a whole cache line, but if there are a lot of branches and/or bad branch prediction, a cache miss can be a big performance hit.

Yes, as I implied in my first response.

OTOH, many microcontrollers have already surpassed the i486 in terms of cache. The current microcontroller I am using, in a Zynq FPGA, is a dual-core ARM Cortex-A9, each core having 32K+32K I+D cache. (The cheapest ARM, IIRC, costs <$1.) That trend will continue, although there will always be some MCUs that don't have/need cache.

More interestingly, some actively avoid cache due to its "poor" behaviour in hard realtime systems, e.g. the very small and cheap XMOS processors with 2-10 cores. http://www.digikey.co.uk/product-search/en/integrated-circuits-ics/embedded-microcontrollers/2556109?k=xmos

Those XMOS processors are the only ones I know where the compiler/IDE guarantees the execution time. With all other processors, all bets are off.

gmb42 · « **Reply #36 on:** January 03, 2015, 07:12:56 pm »

In a post, here, Red Hat explain how they've improved the performance of some math functions in glibc. Eventually I suppose these will filter down to newlib/nanolib etc.

