### Topic: Calling ASM functions in C


#### Psi

• Super Contributor
• Posts: 9837
##### Calling ASM functions in C
« on: April 28, 2012, 11:16:22 am »
Hi, for an avr-gcc project on an ATmega I need to do some fast integer division.

The bit-shift method for dividing by 2/4/8 etc. won't work here, as the divisors are only known at runtime and can be any value.
I've been having a look at the official Atmel app notes on division, which provide size- and speed-optimized sample code in ASM for various operations.
http://www.atmel.com/Images/doc0936.pdf
http://www.atmel.com/Images/AVR200.zip

But I'm not sure how to integrate the ASM into my C program.

Here's one of the ASM functions from the app note:
Code:
;***************************************************************************
;*
;* "div16u" - 16/16 Bit Unsigned Division
;*
;* This subroutine divides the two 16-bit numbers
;* "dd16uH:dd16uL" (dividend) and "dv16uH:dv16uL" (divisor).
;* The result is placed in "dres16uH:dres16uL" and the remainder in
;* "drem16uH:drem16uL".
;*
;* Number of words :196 + return
;* Number of cycles :148/173/196 (Min/Avg/Max)
;* Low registers used :2 (drem16uL,drem16uH)
;* High registers used  :4 (dres16uL/dd16uL,dres16uH/dd16uH,dv16uL,dv16uH)
;*
;***************************************************************************

;***** Subroutine Register Variables

.def drem16uL=r14
.def drem16uH=r15
.def dres16uL=r16
.def dres16uH=r17
.def dd16uL =r16
.def dd16uH =r17
.def dv16uL =r18
.def dv16uH =r19

;***** Code

div16u: clr drem16uL ;clear remainder Low byte
sub drem16uH,drem16uH;clear remainder High byte and carry

rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_1 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_2 ;else
d16u_1: sec ;    set carry to be shifted into result

d16u_2: rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_3 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_4 ;else
d16u_3: sec ;    set carry to be shifted into result

d16u_4: rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_5 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_6 ;else
d16u_5: sec ;    set carry to be shifted into result

d16u_6: rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_7 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_8 ;else
d16u_7: sec ;    set carry to be shifted into result

d16u_8: rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_9 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_10 ;else
d16u_9: sec ;    set carry to be shifted into result

d16u_10:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_11 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_12 ;else
d16u_11:sec ;    set carry to be shifted into result

d16u_12:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_13 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_14 ;else
d16u_13:sec ;    set carry to be shifted into result

d16u_14:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_15 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_16 ;else
d16u_15:sec ;    set carry to be shifted into result

d16u_16:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_17 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_18 ;else
d16u_17: sec ;    set carry to be shifted into result

d16u_18:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_19 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_20 ;else
d16u_19:sec ;    set carry to be shifted into result

d16u_20:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_21 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_22 ;else
d16u_21:sec ;    set carry to be shifted into result

d16u_22:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_23 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_24 ;else
d16u_23:sec ;    set carry to be shifted into result

d16u_24:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_25 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_26 ;else
d16u_25:sec ;    set carry to be shifted into result

d16u_26:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_27 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_28 ;else
d16u_27:sec ;    set carry to be shifted into result

d16u_28:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_29 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_30 ;else
d16u_29:sec ;    set carry to be shifted into result

d16u_30:rol dd16uL ;shift left dividend
rol dd16uH
rol drem16uL ;shift dividend into remainder
rol drem16uH
sub drem16uL,dv16uL ;remainder = remainder - divisor
sbc drem16uH,dv16uH ;
brcc d16u_31 ;if result negative
add drem16uL,dv16uL ;    restore remainder
adc drem16uH,dv16uH
clc ;    clear carry to be shifted into result
rjmp d16u_32 ;else
d16u_31:sec ;    set carry to be shifted into result

d16u_32:rol dd16uL ;shift left dividend
rol dd16uH
ret

Can someone point me in the right direction for wrapping that into something I can call directly from C?
I've not done ASM since x86 at uni, and I seem to remember you have to be careful to restore all the CPU registers you have used to their original state when your ASM code finishes.
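
For checking results off-target, the shift-and-subtract scheme used by the listing can be modelled in portable C. This is only a sketch: the names `div16u_model` and `div16u_result` are made up here and do not appear in the app note, and the remainder is held in a wider type than the two registers the assembler uses.

```c
#include <stdint.h>

/* Quotient/remainder pair; hypothetical type used by this sketch only. */
typedef struct {
    uint16_t quot;
    uint16_t rem;
} div16u_result;

/* Bit-by-bit restoring division: one loop pass per unrolled stage of div16u.
   The divisor is not checked for zero. */
static div16u_result div16u_model(uint16_t dividend, uint16_t divisor)
{
    uint16_t quot = dividend;  /* like dd16u: dividend, gradually replaced by the result */
    uint32_t rem = 0;          /* like drem16u, but widened so the shift cannot overflow */

    for (int i = 0; i < 16; i++) {
        /* shift the top bit of the dividend into the remainder */
        rem = (rem << 1) | (uint32_t)(quot >> 15);
        quot = (uint16_t)(quot << 1);
        if (rem >= divisor) {
            rem -= divisor;    /* subtraction fits: result bit is 1 */
            quot |= 1u;
        }
        /* otherwise the remainder is left as it was and the result bit is 0 */
    }

    div16u_result r = { quot, (uint16_t)rem };
    return r;
}
```

Running the same inputs through this model and through the `/` and `%` operators is a cheap way to build test vectors before trying the assembler version on the chip.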

« Last Edit: April 28, 2012, 11:19:41 am by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)

#### alm

• Guest
##### Re: Calling ASM functions in C
« Reply #1 on: April 28, 2012, 11:23:56 am »
How much faster is this algorithm compared to the code produced by GCC? I would also expect it to produce something similar with optimizations enabled. You can look at the generated assembly or count cycles on the AVR simulator.

I'm sure you'll find tutorials in the avr-gcc docs and on avrfreaks.net.

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #2 on: April 28, 2012, 11:27:04 am »
I've read that div and ldiv in avrgcc stdlib.h aren't optimized

#### amspire

• Super Contributor
• Posts: 3802
##### Re: Calling ASM functions in C
« Reply #3 on: April 28, 2012, 12:42:17 pm »
It is worth testing the GCC performance anyway. If you have division operations in your code using constants, GCC actually can optimize the code for the particular constant.

For example, if my_variable is an unsigned long, it can work out that "my_variable / 0x01000000L" means take the top byte and ignore the rest. So the long division is optimized down to a 1 cycle "mov" instruction if my_variable and the destination were already loaded in the registers.
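
A small sketch of that equivalence, with made-up function names. Whether the compiler really reduces the first form to a single register move depends on the compiler version and optimization level, so it is worth checking the generated listing.

```c
#include <stdint.h>

/* Division by a compile-time constant that happens to be 2^24. */
static uint8_t top_byte_by_division(uint32_t my_variable)
{
    return (uint8_t)(my_variable / 0x01000000UL);
}

/* What the optimizer can turn it into: keep the top byte, drop the rest. */
static uint8_t top_byte_by_shift(uint32_t my_variable)
{
    return (uint8_t)(my_variable >> 24);
}
```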

Richard.

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #4 on: April 28, 2012, 12:54:06 pm »
yeah, sadly this isn't a constant.

There are a few places I need a fast divide, but one is correcting an ADC value for a sensor which doesn't always use the full ADC scale (0-5V).
It has to be field adjustable, so, for example, if the sensor min is 0.6V and max 4.5V, this can be corrected for in code to produce a 0-255 value corresponding to min-max, e.g.

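if (adc_data > minval)        // condition missing from the post; inferred from the else branch
{
    if (adc_data < maxval)    // likewise inferred
    {
        div_struct = ldiv(((uint16_t)(adc_data) - minval) * 255, (maxval - minval));
        sensordata = div_struct.quot;
    }
    else
    {
        sensordata = 255;
    }
}
else
{
    sensordata = 0;
}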


#### PeterG

• Frequent Contributor
• Posts: 830
##### Re: Calling ASM functions in C
« Reply #5 on: April 28, 2012, 01:14:38 pm »
This site should send you in the right direction.

http://www.nongnu.org/avr-libc/user-manual/inline_asm.html

Regards
Testing one two three...

#### amspire

• Super Contributor
• Posts: 3802
##### Re: Calling ASM functions in C
« Reply #6 on: April 28, 2012, 01:16:48 pm »
If you have some RAM free, the fastest way to do the correction is to have the micro generate a correction table when it starts, and then the correction is performed with the lookup table rather than calculations. If you can truncate the A/D to 9 bits, then you would need 512 bytes of RAM.
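
A minimal sketch of such a table, assuming 9-bit samples and limits given in the same 9-bit counts. All names here are invented for illustration.

```c
#include <stdint.h>

#define ADC_BITS   9u
#define TABLE_SIZE (1u << ADC_BITS)   /* 512 entries of one byte each */

static uint8_t correction_table[TABLE_SIZE];

/* Fill the table once at start-up. Requires minval < maxval. */
static void build_correction_table(uint16_t minval, uint16_t maxval)
{
    for (uint16_t adc = 0; adc < TABLE_SIZE; adc++) {
        if (adc <= minval) {
            correction_table[adc] = 0;
        } else if (adc >= maxval) {
            correction_table[adc] = 255;
        } else {
            /* 32-bit intermediate: (adc - minval) * 255 can exceed 16 bits */
            correction_table[adc] =
                (uint8_t)(((uint32_t)(adc - minval) * 255u) / (uint16_t)(maxval - minval));
        }
    }
}

/* Per-sample correction: one masked array read, no division. */
static inline uint8_t correct_sample(uint16_t adc)
{
    return correction_table[adc & (TABLE_SIZE - 1u)];
}
```

All the slow divisions happen once in `build_correction_table`; the table would have to be rebuilt whenever the limits are adjusted in the field.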

I haven't combined assembler with C++ but I know you can do it - there are plenty of sites with examples for the Arduino.

This may be what you are looking for:

http://www.nongnu.org/avr-libc/user-manual/inline_asm.html

Edit: PeterG beat me to it.

Here is an example:

http://blog.threebytesfull.com/post/6475041619/arduino-avr-assembly-language

I suspect that when you call your 16 bit div code program, you will have to also include push and pop for r14 to r19 plus the status register. What is that - 28 extra cycles?
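
On the register question, the avr-gcc calling convention (described in the avr-libc FAQ) treats r18-r27, r30 and r31 as call-clobbered and r2-r17, r28 and r29 as call-saved, and an ordinary function does not have to preserve the status register. Two 16-bit arguments arrive in r25:r24 and r23:r22, and a 16-bit result goes back in r25:r24. Under those rules a wrapper around `div16u` would only need to save r14-r17. The C side might then look like the sketch below; `div16u_c` and the file name are hypothetical, and the non-AVR branch is just a stand-in so the calling code can be built on a PC.

```c
#include <stdint.h>

#if defined(__AVR__)
/*
 * Hypothetical wrapper, assumed to live in a separate assembler file
 * (say div16u_wrap.S) that is assembled and linked with the project.
 * It would push r14-r17, move the arguments from r25:r24 and r23:r22
 * into the registers div16u expects, call div16u, move the quotient
 * to r25:r24, pop r17-r14 and return.
 */
extern uint16_t div16u_c(uint16_t dividend, uint16_t divisor);
#else
/* Off-target stand-in so the calling code can be built and tested on a PC. */
static inline uint16_t div16u_c(uint16_t dividend, uint16_t divisor)
{
    return (uint16_t)(dividend / divisor);
}
#endif

/* Example caller: to C code the routine looks like any other function. */
static uint16_t average(uint16_t sum, uint16_t count)
{
    return div16u_c(sum, count);
}
```

The four pushes and four pops plus the register moves and the call overhead eat into whatever the hand-written routine gains, so the comparison against the compiler's own division should include them.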

Richard.

#### alm

• Guest
##### Re: Calling ASM functions in C
« Reply #7 on: April 28, 2012, 01:56:20 pm »
I've read that div and ldiv in avrgcc stdlib.h aren't optimized
So without doing any measurements, and based on possibly outdated information, you decide that there is a problem that needs solving? You're the kind of programmer that creates all kinds of unnecessary hacks that introduce bugs and complicate maintenance to solve imaginary performance problems?

Do some measurements and see if the performance of the div operation (any reason why you don't use the / operator?) is a problem. Then determine how many cycles it takes in the code produced by GCC compared to the performance claimed by the Atmel AN, and decide if the extra complexity is worth it. This AVRfreaks thread suggests that GCC performance is better than the AVR200 appnote, for example.

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #8 on: April 28, 2012, 02:07:30 pm »
I've read that div and ldiv in avrgcc stdlib.h aren't optimized
So without doing any measurements, and based on possibly outdated information, you decide that there is a problem that needs solving? You're the kind of programmer that creates all kinds of unnecessary hacks that introduce bugs and complicate maintenance to solve imaginary performance problems?

The performance issue is not imaginary. Some of the functions using ldiv are running quite slowly at present, so I wanted to try the AVR200 asm functions and see if they're any better.
« Last Edit: April 28, 2012, 02:09:46 pm by Psi »

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #9 on: April 28, 2012, 02:13:24 pm »
If you have some RAM free, the fastest way to do the correction is to have the micro generate a correction table when it starts, and then the correction is performed with the lookup table rather than calculations. If you can truncate the A/D to 9 bits, then you would need 512 bytes of RAM.

I haven't combined assembler with C++ but I know you can do it - there are plenty of sites with examples for the Arduino.

This may be what you are looking for:

http://www.nongnu.org/avr-libc/user-manual/inline_asm.html

Edit: PeterG beat me to it.

Here is an example:

http://blog.threebytesfull.com/post/6475041619/arduino-avr-assembly-language

I suspect that when you call your 16 bit div code program, you will have to also include push and pop for r14 to r19 plus the status register. What is that - 28 extra cycles?

Richard.

thanks, some useful info there.

I want to try and avoid a lookup table unless it turns out to be the only way.  I'm only using the ADC in 8-bit mode and have enough free RAM for it.


#### TerminalJack505

• Super Contributor
• Posts: 1310
##### Re: Calling ASM functions in C
« Reply #10 on: April 28, 2012, 02:33:04 pm »
I'd be curious to see if you can outsmart the compiler.  I did this exact same exercise (division) and the compiler beat me by 2 cycles, if I remember correctly.

You really should be using the '/' operator.  The only reason to use div() or ldiv(), that I can think of, is if you intend to use both the quotient and the remainder--in which case it is faster than the two '/' and '%' operations.
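
For comparison, the two styles side by side. The helper names are made up; `div()` and `div_t` are the standard `<stdlib.h>` ones.

```c
#include <stdlib.h>

/* Only the quotient is needed: the plain operator is enough. */
static int quotient_only(int num, int den)
{
    return num / den;
}

/* Quotient and remainder are both needed: one div() call returns the pair. */
static void quotient_and_remainder(int num, int den, int *quot, int *rem)
{
    div_t d = div(num, den);
    *quot = d.quot;
    *rem = d.rem;
}
```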

The avr-libc user's manual should explain how to call your assembler routine.

#### AntiProtonBoy

• Frequent Contributor
• Posts: 988
• I think I passed the Voight-Kampff test.
##### Re: Calling ASM functions in C
« Reply #11 on: April 28, 2012, 02:49:10 pm »
Are maxval and minval constants? If so, you could pre-compute the reciprocal of the (maxval - minval) term and use fixed point arithmetic.

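//Define this as global. Scaled by 256 so that the >> 8 below undoes the scaling
const uint16_t N = (uint16_t)((255.0f * 256.0f)  /  (float)(maxval - minval));

if (adc_data > minval)        // condition missing from the post; inferred from the else branch
{
    if (adc_data < maxval)    // likewise inferred
    {
        uint16_t Result  = adc_data - minval;
        Result = (Result * N) >> 8;
        sensordata = (uint8_t)Result;
    }
    else
    {
        sensordata = 255;
    }
}
else
{
    sensordata = 0;
}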

This is just a quick brain dump, not tested. I'm assuming that all variables are uint8_t integer types, suitable for an 8-bit architecture.
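
A self-contained variant of the same idea that can be tried on a PC, assuming 8-bit samples and limits that only change at start-up. The function and variable names are invented, and the factor is rounded up so that a reading equal to maxval comes out as 255.

```c
#include <stdint.h>

static uint16_t scale_factor;   /* 8.8 fixed point, about 255 * 256 / (maxval - minval) */
static uint8_t  scale_min;
static uint8_t  scale_max;

/* Call once, e.g. after reading the limits at start-up. Requires minval < maxval. */
static void scale_init(uint8_t minval, uint8_t maxval)
{
    uint16_t range = (uint16_t)(maxval - minval);
    scale_min = minval;
    scale_max = maxval;
    /* Rounded up; offset * scale_factor still stays below 65536 for offset < range. */
    scale_factor = (uint16_t)((255u * 256u + range - 1u) / range);
}

/* Per-sample work: one subtraction, one 16-bit multiply, one shift. */
static uint8_t scale_sample(uint8_t adc)
{
    if (adc <= scale_min) return 0;
    if (adc >= scale_max) return 255;
    uint16_t offset = (uint16_t)(adc - scale_min);
    return (uint8_t)((uint16_t)(offset * scale_factor) >> 8);
}
```

Because the factor is rounded, individual results can differ by one count from the exact truncating division.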

What is the native integer size for the processor?
« Last Edit: April 28, 2012, 02:51:50 pm by AntiProtonBoy »

#### bingo600

• Super Contributor
• Posts: 1963
##### Re: Calling ASM functions in C
« Reply #12 on: April 28, 2012, 07:43:01 pm »
I'm quite sure the elm-chan routines here are for avr-gcc

http://elm-chan.org/cc_e.html

Scroll down to : AVR assembler libraries

/Bingo

#### nctnico

• Super Contributor
• Posts: 26575
##### Re: Calling ASM functions in C
« Reply #13 on: April 28, 2012, 08:36:24 pm »
The best thing to do is to avoid divisions. Atmega has multiply instructions so it will be a lot faster to multiply by the reciprocal value and divide (shift) by a power of 2.

If you want to divide by 333
First calculate (2^16) / 333 = 196.8, rounded to 197

3330 / 333=10
(3330 * 197) >>16 = 10

You have to watch out for overflows though. It's best to use 32 bit types or choose the power of 2 factor carefully.
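
The arithmetic above as a small C sketch with a 32-bit intermediate. The helper names are made up, and the reciprocal is rounded up, which matches the 197 in the example (65536 / 333 is about 196.8).

```c
#include <stdint.h>

/* Reciprocal scaled by 2^16 and rounded up. Needs divisor >= 2 so it fits in 16 bits. */
static uint16_t reciprocal16(uint16_t divisor)
{
    return (uint16_t)((65536UL + divisor - 1u) / divisor);
}

/* Approximate n / divisor as (n * reciprocal) >> 16, using a 32-bit product. */
static uint16_t divide_by_reciprocal(uint16_t n, uint16_t reciprocal)
{
    return (uint16_t)(((uint32_t)n * reciprocal) >> 16);
}
```

The slow division is done once per divisor in `reciprocal16`; every later divide costs a multiply and a shift. The result is an approximation: because the reciprocal is rounded, some inputs come out one higher than the exact quotient, so the usable input range should be checked for the divisors in question.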
There are small lies, big lies and then there is what is on the screen of your oscilloscope.

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #14 on: April 29, 2012, 12:23:55 am »
Yeah, for my above example that will work because min/max are read out of EEPROM at power-on, so the reciprocal can be calculated once at that point. Thanks

But I can't see it working in other areas of my code, such as this table X/Y interpolate function, as the variables change in real time.

Code:
// (T,B,L,R = top/bot/left/right)
uint8_t *rowlabel1, uint8_t *rowlabel2, uint8_t *collabel1, uint8_t *collabel2,
uint8_t *inputrow, uint8_t *inputcol)
{
div_t divout;

colstep = *collabel2-*collabel1;
rowstep = *rowlabel2-*rowlabel1;
temp1 = *inputcol - *collabel1;

return (quadCAB - (divout.quot * (*inputrow - *rowlabel1)));
}
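
The function as posted is incomplete, so the following is not a reconstruction of it. It is only a generic sketch of one-dimensional integer interpolation between two neighbouring table labels, with an invented name and signature, to show where the run-time division sits.

```c
#include <stdint.h>

/* Linear interpolation between (x1, y1) and (x2, y2), assuming x1 <= x <= x2. */
static uint8_t lerp_u8(uint8_t x, uint8_t x1, uint8_t x2, uint8_t y1, uint8_t y2)
{
    if (x2 == x1) {
        return y1;                       /* avoid dividing by zero */
    }
    int32_t dy   = (int32_t)y2 - y1;     /* may be negative */
    int32_t dx   = (int32_t)x - x1;
    int32_t step = (int32_t)x2 - x1;
    /* one signed division per call; the divisor changes with the table cell */
    return (uint8_t)(y1 + (dy * dx) / step);
}
```

If the label spacing in the table is fixed, `step` can be replaced by a precomputed reciprocal; if it varies per cell, one reciprocal per row and column could be stored alongside the table.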

« Last Edit: April 29, 2012, 12:34:55 am by Psi »

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #15 on: April 29, 2012, 12:28:14 am »
I'd be curious to see if you can outsmart the compiler.  I did this exact same exercise (division) and the compiler beat me by 2 cycles, if I remember correctly.

You really should be using the '/' operator.  The only reason to use div() or ldiv(), that I can think of, is if you intend to use both the quotient and the remainder--in which case it is faster than the two '/' and '%' operations.

Interesting, I wasn't aware of that. I assumed using / on integers resulted in a floating point divide with the result truncated and stored as an integer.

Thanks, that cleared some stuff up.


#### IanB

• Super Contributor
• Posts: 11734
##### Re: Calling ASM functions in C
« Reply #16 on: April 29, 2012, 12:38:17 am »
Interesting, i wasn't aware of that, i assumed using a / on all integers resulted in a floating point divide with the result truncated and stored as integer.

Absolutely not. Integer division is much easier to perform than floating point division.

(One exception is on certain hardware like the IBM RS6000 where double precision division is faster than integer division. In that case the compiler will optimize integer divides by promoting them to floating point divides and truncating. But that is not usually the case on most hardware.)

#### TerminalJack505

• Super Contributor
• Posts: 1310
##### Re: Calling ASM functions in C
« Reply #17 on: April 29, 2012, 12:49:51 am »
Thankfully, the compiler will do integer math so long as both operands are integers.

Doing just one floating point operation anywhere in your program will link in the floating point library.  This is about 4 or 5k, if I remember correctly, so you want to be careful about that.

#### alm

• Guest
##### Re: Calling ASM functions in C
« Reply #18 on: April 29, 2012, 12:56:58 am »
The defaulting to integer division can also give unexpected effects, for example:
Code:
int i = 3;
double f = i / 2;
results in f = 1.0, not 1.5 as you might expect. Writing i / 2. (note the decimal point) forces the float division.
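
The two cases as functions that can be compiled and checked; the function names are made up.

```c
/* Both operands are integers: the division truncates before the
   result is converted to double. */
static double half_with_integer_division(int i)
{
    double f = i / 2;
    return f;
}

/* One operand is a double: i is converted first and the division
   is carried out in floating point. */
static double half_with_float_division(int i)
{
    double f = i / 2.0;
    return f;
}
```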

Floating point operations bloat the code size on micros without hardware FP, unless they can be completely calculated at compile time. This is why _delay_us(2.5) does not require any floating point code. The compiler calculates the required number of iterations at compile time, as long as the argument is a constant. This is why the documentation contains warnings about calling this function with a variable argument, since it forces the compiler to do FP calculations at run time, with the associated size and performance penalty.

#### Psi

• Super Contributor
• Posts: 9837
##### Re: Calling ASM functions in C
« Reply #19 on: April 29, 2012, 01:04:10 am »
Yeah, I have another function that currently does use floating point math, which I need to optimize to integer, but that's a separate issue.

#### IanB

• Super Contributor
• Posts: 11734
##### Re: Calling ASM functions in C
« Reply #20 on: April 29, 2012, 01:06:38 am »
results in f = 1.0, not 1.5 as you might expect

It's important not to expect that. The difference between integer and floating point division should perhaps be one of the first things you learn in programming class. I remember it was for me.

The reason why you can't expect a floating point division is that the left hand side of an assignment never influences the evaluation of what is on the right hand side. It's a golden rule.

#### alm

• Guest
##### Re: Calling ASM functions in C
« Reply #21 on: April 29, 2012, 01:23:56 am »
A golden rule in C at least. Don't expect this to be universal across programming languages. One exception that comes to mind is Perl, which has the concept of the context of a function/operator. I'm not aware of anyone using Perl on 8-bit micros, though.

#### IanB

• Super Contributor
• Posts: 11734
##### Re: Calling ASM functions in C
« Reply #22 on: April 29, 2012, 01:27:57 am »
A golden rule in C at least. Don't expect this to be universal across programming languages. One exception that comes to mind is Perl, which has the concept of the context of a function/operator. I'm not aware of anyone using Perl on 8-bit micros, though.

OK, you got me. I'm a Perl fan and capable Perl programmer.

But it is true in C family languages, and in Fortran. Maybe it's not a golden rule but a silver rule?

#### alm

• Guest
##### Re: Calling ASM functions in C
« Reply #23 on: April 29, 2012, 01:42:33 am »
I think at the very least we can agree that it's important to know whether the rule applies in your current programming language. Both the concept of context (e.g. calling a function in list context in Perl) and the lack of context (expecting a float division because the variable on the left hand side is a float) can easily cause confusion and seemingly obscure bugs.

#### TerminalJack505

• Super Contributor
• Posts: 1310
##### Re: Calling ASM functions in C
« Reply #24 on: April 29, 2012, 01:54:53 am »
Wow!  I'm a big Perl fan as well.

Three of us right here.  What are the odds?

You actually don't see many Perl coders anymore.  Python seems to have stolen its thunder.
