Topic: AVR compiler that does a good job optimizing for size?

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
AVR compiler that does a good job optimizing for size?
« on: April 29, 2015, 03:48:54 am »
My expectations that the new ipa optimizer in GCC 5 would generate code closer to hand-written assembler were in vain.  Here's one example of how avr-gcc does a terrible job:
https://gcc.gnu.org/ml/gcc/2015-04/msg00361.html

Has anyone tried avr-llvm?
https://github.com/avr-llvm/llvm

Is there any other free compiler for 8-bit AVR that does a better job than GCC?
Unthinking respect for authority is the greatest enemy of truth. Einstein
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21681
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: AVR compiler that does a good job optimizing for size?
« Reply #1 on: April 29, 2015, 04:13:21 am »
Does it go away with -O4 instead of -Os?

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Unthinking respect for authority is the greatest enemy of truth. Einstein
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: AVR compiler that does a good job optimizing for size?
« Reply #3 on: April 29, 2015, 05:22:29 am »
It's worth reporting those kinds of issues, or at least posting to the gcc-help mailing list.

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #4 on: April 29, 2015, 11:55:01 am »

Quote
Flash is cheap these days, most people just move up to the next device rather than spend a lot of money on a compiler to save 5-10%.
That doesn't help when you're trying to write code for millions of mega328's out there already.
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.
 

Offline c4757p

  • Super Contributor
  • ***
  • Posts: 7799
  • Country: us
  • adieu
Re: AVR compiler that does a good job optimizing for size?
« Reply #5 on: April 29, 2015, 12:09:44 pm »
Quote
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.

A massive amount of work for small benefit. They're good enough that it doesn't matter unless you're cycle-pinching (or if you're so phenomenally bad at C that it comes out shit but it's your fault), chips with enough power not to need cycle-pinching are cheap enough these days that most people don't have to bother, and any compiler that could achieve hand assembly levels of optimization would have to be so closely coupled to its platform that you'd be practically writing a new compiler for every chip.

The very few of us who need hand-assembly efficiency can still just write assembly by hand.
No longer active here - try the IRC channel if you just can't be without me :)
 

Offline engineer_in_shorts

  • Regular Contributor
  • *
  • Posts: 122
  • Country: gb
Re: AVR compiler that does a good job optimizing for size?
« Reply #6 on: April 29, 2015, 12:58:19 pm »

 Flash is cheap these days, most people just move up to the next device rather than spend a lot of money on a compiler to save 5-10%.
That doesn't help when you're trying to write code for millions of mega328's out there already.
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.

Try IAR EWAVR
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #7 on: April 29, 2015, 01:24:44 pm »
Quote
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.
any compiler that could achieve hand assembly levels of optimization would have to be so closely coupled to its platform that you'd be practically writing a new compiler for every chip.

When I've looked at GCC's x86 code, it's been pretty close to what I could write myself, so I doubt it's a matter of GCC's GIMPLE being too restrictive.  I suspect the real reason is that the x86 (and probably ARM) back-ends get more attention than the AVR back-end.
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #8 on: April 29, 2015, 01:28:34 pm »

 Flash is cheap these days, most people just move up to the next device rather than spend a lot of money on a compiler to save 5-10%.
That doesn't help when you're trying to write code for millions of mega328's out there already.
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.

Try IAR EWAVR

I read on Avrfreaks that someone using IAR was getting consistently bigger binaries than with avr-gcc 4.3.  GCC 4.4 through 4.7 seemed to get worse at AVR code generation, but 4.9.2 with LTO now generates even smaller code than 4.3.
 

Offline nuno

  • Frequent Contributor
  • **
  • Posts: 606
  • Country: pt
Re: AVR compiler that does a good job optimizing for size?
« Reply #9 on: April 29, 2015, 02:22:04 pm »
There are a few simple things you can do in the program that will reduce code size. For example, have you declared as "static" the functions and globals that are used in only one file?
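A minimal sketch of the "static" suggestion, with made-up names (`offset`, `scale` and `read_scaled` are hypothetical and not from any project in this thread). Giving file-local helpers and globals internal linkage lets the compiler see every use, so it is free to inline the helper and drop the out-of-line copy:

```c
#include <stdint.h>

/* File-local state: 'static' gives internal linkage, so no other
 * translation unit can reference it. (Hypothetical example names.) */
static uint8_t offset = 3;

/* File-local helper: with every caller visible, the compiler may inline
 * it and omit the standalone copy of the function. */
static uint8_t scale(uint8_t raw)
{
    return (uint8_t)(raw >> 2);
}

/* The only symbol this file exports. */
uint8_t read_scaled(uint8_t raw)
{
    return (uint8_t)(scale(raw) + offset);
}
```

How much this saves depends on the program; comparing the map file or the per-function sizes before and after is the reliable check.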
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21681
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #11 on: April 30, 2015, 01:51:42 pm »
Quote
I suppose you have to consider that the AVR target doesn't get as much love as x86 from the developers, for obvious reasons. Also be wary of assuming that the compiler is being dumb, like that link you posted. Often it will create what looks like pointless code, until you find that some other bit of code jumps into the middle of it to save repetition of what follows.

If you want the GCC guys to improve a specific issue you should come up with a minimal test case. Just main() and as little code as possible to illustrate the problem.
*I* wrote the post on the gcc list.
I've written a lot of asm and can often cut code size by 50% compared to C.
I re-wrote optiboot in asm and got it under 256 bytes, and still have room to add eeprom programming.
https://github.com/nerdralph/picoboot/tree/master/arduino
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11631
  • Country: my
  • reassessing directives...
Re: AVR compiler that does a good job optimizing for size?
« Reply #12 on: April 30, 2015, 02:47:58 pm »
Quote
Flash is cheap these days, most people just move up to the next device rather than spend a lot of money on a compiler to save 5-10%.
That doesn't help when you're trying to write code for millions of mega328's out there already.
I started writing 8-bit assembler over 30 years ago. It surprises me that compilers still can't match what a programmer can do in assembler after all this time.
That doesn't help if you try to code a Win8-style GUI on an embedded system. Assembly is good for efficient code in simple applications; once your application grows you'll start thinking twice. A compiler will never match hand-tuned assembly, because its constructs are generic for ease of programming. Take for loops and if..endif as examples: they are built on generic constructs that have to satisfy all programming conditions, so inevitably they take more than "case by case" hand-tuned assembly would. I do assembly when it suits, and a higher-level language where that suits; each has its own pros and cons. Mourning or nitpicking over which way is better will not help.

I've made a small project to make PIC and AVR asm code portable between each other by bringing assembly a little bit higher (but still lower than mainstream languages like C/VB/Delphi et al.), using a bunch of macros (not so giant, but expandable for every programming need; mine is only for myself), including macro constructs for "loop" and "if" to make the asm coding experience better across a few machine variants. I don't develop compilers or higher-level languages to the full extent (heck, hackers/makers/hobbyists/engineers don't develop compilers/parsers, we have different purposes :P), but I know what it takes, and I encourage you to do the same if you want to get the idea ;) cheers.
« Last Edit: April 30, 2015, 02:50:37 pm by Mechatrommer »
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline nuno

  • Frequent Contributor
  • **
  • Posts: 606
  • Country: pt
Re: AVR compiler that does a good job optimizing for size?
« Reply #13 on: April 30, 2015, 03:20:29 pm »
A pure C compiler will never be able to write Assembly code as compact as a good human programmer can, simply because the C language wasn't designed to capture all of the human's intentions; the compiler doesn't know enough about the data ranges, usage, and so on. I've written two or three pretty compact projects for AVR using GCC in 99.9% C (sometimes a NOP or SLEEP comes in handy). I usually look at the Assembly it generates and sometimes do some C code "re-ordering" that makes it generate more compact Assembly; in my experience, given a hand, it does generate pretty compact code.
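As a rough illustration of "giving it a hand" (a sketch only; `sum8` is a hypothetical function, and whether it helps depends on the compiler version): stating the data range in the types and counting down to zero are two rewrites of this kind, since an 8-bit counter fits one AVR register and the loop exit becomes a test against zero.

```c
#include <stdint.h>

/* Sum the first n bytes of buf, truncated to 8 bits.
 * n is a uint8_t, so the range 0..255 is stated in the type, and the
 * loop counts down so the exit condition is a comparison with zero. */
uint8_t sum8(const uint8_t *buf, uint8_t n)
{
    uint8_t sum = 0;
    while (n--) {
        sum = (uint8_t)(sum + *buf++);
    }
    return sum;
}
```

It is still worth reading the generated assembly afterwards, because the compiler does not always keep the narrow type.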
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #14 on: May 02, 2015, 10:26:21 pm »
It turns out to be a 5+ yr old bug.
gcc.gnu.org/PR41076
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13746
  • Country: gb
    • Mike's Electric Stuff
Re: AVR compiler that does a good job optimizing for size?
« Reply #15 on: May 03, 2015, 09:12:36 am »
IAR is pretty good. It supports register variables for when you really need maximum speed.

However with any compiler, the best optimiser is the  user - you can do way more by carefully considering how you write the code than any compiler can ever hope to optimise well.

You also need to review the code that's being generated, if only by looking at the size of each function,  to see if there may be something that's not being done efficiently.

Also be very cautious about using any C library function (e.g. printf), as these will often include functionality that you don't need that can add huge overheads. 
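For instance, if all that is needed is a byte printed in hex, a few lines of hand-written conversion can stand in for a formatted-print call. This is a sketch under that assumption; `nibble_to_hex` and `u8_to_hex` are hypothetical names, not library functions:

```c
#include <stdint.h>

/* Convert one nibble (0..15) to its ASCII hex digit. */
static char nibble_to_hex(uint8_t n)
{
    return (char)(n < 10 ? '0' + n : 'A' + (n - 10));
}

/* Write two hex digits plus a terminating NUL into out (3 bytes needed). */
void u8_to_hex(uint8_t value, char out[3])
{
    out[0] = nibble_to_hex((uint8_t)(value >> 4));
    out[1] = nibble_to_hex((uint8_t)(value & 0x0F));
    out[2] = '\0';
}
```

The caller then sends the three-byte buffer through whatever output routine the project already has.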
« Last Edit: May 03, 2015, 09:14:22 am by mikeselectricstuff »
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline nuno

  • Frequent Contributor
  • **
  • Posts: 606
  • Country: pt
Re: AVR compiler that does a good job optimizing for size?
« Reply #16 on: May 03, 2015, 11:00:08 am »
Quote
IAR is pretty good. It supports register variables for when you really need maximum speed.
GCC does it also
https://gcc.gnu.org/onlinedocs/gcc/Explicit-Reg-Vars.html
 

Offline Mechanical Menace

  • Super Contributor
  • ***
  • Posts: 1288
  • Country: gb
Re: AVR compiler that does a good job optimizing for size?
« Reply #17 on: May 03, 2015, 11:23:40 am »
No offence but why not use the 4.9.2 release when the 5 series is only a couple of weeks old and as always will need a couple of bug and regression fixes before it's on a par with earlier versions?
Second sexiest ugly bloke on the forum.
"Don't believe every quote you read on the internet, because I totally didn't say that."
~Albert Einstein
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #18 on: May 03, 2015, 06:40:47 pm »
Quote
No offence but why not use the 4.9.2 release when the 5 series is only a couple of weeks old and as always will need a couple of bug and regression fixes before it's on a par with earlier versions?
Why not read what I posted to the gcc list so you would know I tried 4.8, 4.9, and 5.0 and got the same code output?
 

Offline nuno

  • Frequent Contributor
  • **
  • Posts: 606
  • Country: pt
Re: AVR compiler that does a good job optimizing for size?
« Reply #19 on: May 03, 2015, 08:40:58 pm »
I still use AVR-GCC 4.1.1, it also generates dumb code although not as much (this same code is generated with -Os, -O1, -O2 and -O3):

Code:
unsigned short f(unsigned char a, unsigned char b)
{
  4c: 99 27        eor r25, r25
  4e: 77 27        eor r23, r23
  50: 76 2f        mov r23, r22
  52: 66 27        eor r22, r22
    return a | (b << 8);
}
  54: 86 2b        or r24, r22
  56: 97 2b        or r25, r23
  58: 08 95        ret
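One commonly suggested way to sidestep the shift in a case like this is to assemble the 16-bit value from its bytes through a union. The sketch below is illustrative only: the type `word16` and both function names are made up for this example, the byte order assumes a little-endian target such as AVR, and whether it produces shorter code depends on the compiler version, so the generated assembly should be checked.

```c
#include <stdint.h>

/* Overlay a 16-bit value on its two bytes. The member order assumes a
 * little-endian target (true for AVR); this is not portable. */
typedef union {
    uint16_t word;
    struct {
        uint8_t lo;
        uint8_t hi;
    } bytes;
} word16;

uint16_t combine_union(uint8_t a, uint8_t b)
{
    word16 w;
    w.bytes.lo = a;  /* low byte taken directly from a */
    w.bytes.hi = b;  /* high byte taken directly from b, no shift written */
    return w.word;
}

/* Portable variant. On a target with 16-bit int (such as AVR), casting
 * before the shift keeps the shift in unsigned arithmetic. */
uint16_t combine_shift(uint8_t a, uint8_t b)
{
    return (uint16_t)(a | ((uint16_t)b << 8));
}
```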
 

Offline Mechanical Menace

  • Super Contributor
  • ***
  • Posts: 1288
  • Country: gb
Re: AVR compiler that does a good job optimizing for size?
« Reply #20 on: May 04, 2015, 12:30:06 pm »
No offence but why not use the 4.9.2 release when the 5 series is only a couple of weeks old and as always will need a couple of bug and regression fixes before it's on a par with earlier versions?
Why not read what I posted to the gcc list so you would know I tried 4.8, 4.9, and 5.0 and got the same code output?

Because your posting to the mailing list made no mention of 4.9...
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #21 on: May 04, 2015, 05:50:04 pm »
No offence but why not use the 4.9.2 release when the 5 series is only a couple of weeks old and as always will need a couple of bug and regression fixes before it's on a par with earlier versions?
Why not read what I posted to the gcc list so you would know I tried 4.8, 4.9, and 5.0 and got the same code output?

Because your posting to the mailing list made no mention of 4.9...

Ok, sorry for the sarcasm then.
Although if you also look at the PR, you'll see that it's never been fixed, and therefore exists in 4.9.x.
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: AVR compiler that does a good job optimizing for size?
« Reply #22 on: May 05, 2015, 06:48:33 am »
Quote
Use -O1 in the compiler and -Os in the linker.
From the ld documentation:
Quote
At the moment this option only affects ELF shared library generation. Future releases of the linker may make more use of this option. Also currently there is no difference in the linker's behaviour for different non-zero values of this option. Again this may change with future releases.
I.e., -O will have no effect on statically linked binaries, which is what you get when compiling for AVRs, and the linker has no -Os option.

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: AVR compiler that does a good job optimizing for size?
« Reply #23 on: May 07, 2015, 11:46:44 pm »
Quote
with any compiler, the best optimiser is the  user - you can do way more by carefully considering how you write the code than any compiler can ever hope to optimise well.

that's a nice theory, but once the optimizer gets "too smart", it can start adding extra code that you never wrote, for unclear reasons.
In fact, that was another one of RalphD's discoveries:  http://nerdralph.blogspot.com/2015/03/fastest-avr-software-spi-in-west.html
(discussed some more here: http://www.avrfreaks.net/forum/avr-gcc-creating-loop-counters-out-nothing-no-reason )
(tl/dr summary: gcc creates and manipulates a 16bit loop counter for the supposed-to-be clever "for(bit = 0x80; bit; bit >>= 1) {" loop.  Really annoying!)

Presumably the root cause of many of these issues is that the compiler does its optimizations at a level of abstraction that makes assumptions about "costs" that are untrue for the 8-bit AVR architecture (which is, after all, pretty far from what gcc usually supports).
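For readers who have not seen the idiom, here is a self-contained sketch of that kind of loop. It is not the code from the linked blog post: `mosi_write`, `sck_pulse` and the two `sim_` variables are hypothetical stand-ins for the real port writes, so the example can run on a PC.

```c
#include <stdint.h>

/* Stand-ins for the real port writes (hypothetical; on an AVR these
 * would be writes to a PORT register). */
static uint8_t sim_shifted;   /* bits "seen" on the simulated MOSI line */
static uint8_t sim_clocks;    /* number of clock pulses issued */

static void mosi_write(uint8_t level)
{
    sim_shifted = (uint8_t)((sim_shifted << 1) | (level ? 1u : 0u));
}

static void sck_pulse(void)
{
    sim_clocks++;
}

/* Send one byte MSB first. The walking mask doubles as the loop counter:
 * after eight right shifts it becomes zero and the loop ends. */
void spi_send(uint8_t data)
{
    uint8_t bit;
    for (bit = 0x80; bit; bit >>= 1) {
        mosi_write((uint8_t)(data & bit));
        sck_pulse();
    }
}
```

The intent is that no separate counter variable is needed, which is why a compiler-invented 16-bit counter is so annoying here.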
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13746
  • Country: gb
    • Mike's Electric Stuff
Re: AVR compiler that does a good job optimizing for size?
« Reply #24 on: May 07, 2015, 11:58:02 pm »
Quote
with any compiler, the best optimiser is the  user - you can do way more by carefully considering how you write the code than any compiler can ever hope to optimise well.

that's a nice theory, but once the optimizer gets "too smart", it can start adding extra code that you never wrote, for unclear reasons.
In fact, that was another one of RalphD's discoveries:  http://nerdralph.blogspot.com/2015/03/fastest-avr-software-spi-in-west.html
(discussed some more here: http://www.avrfreaks.net/forum/avr-gcc-creating-loop-counters-out-nothing-no-reason )
(tl/dr summary: gcc creates and manipulates a 16bit loop counter for the supposed-to-be clever "for(bit = 0x80; bit; bit >>= 1) {" loop.  Really annoying!)

Presumably the root cause of many of these issues is that the compiler does its optimizations at a level of abstraction that makes assumptions about "costs" that are untrue for the 8-bit AVR architecture (which is, after all, pretty far from what gcc usually supports).
Exactly. And it isn't always the compiler - there are some things that arise from differences between the C language syntax and the CPU instruction set itself. e.g. if I do a<<=1;  the CPU generates a carry bit, but C can't make any use of it.
MCU-specific compilers like IAR are  likely to do a much better job than generic ones like GCC as they will be designed to have better visibility of the underlying instruction set. 
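A small sketch of the carry-bit point (`rol8` is a hypothetical name): on AVR the bit shifted out lands in the carry flag, where an instruction such as `rol` or `adc` can pick it up, but in C the same bit has to be captured with an explicit test before the shift, and it is then up to the compiler to recognise the pattern.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by one. C has no carry flag, so the bit
 * that the shift pushes out is saved with an explicit test beforehand. */
uint8_t rol8(uint8_t a)
{
    uint8_t carry = (a & 0x80) ? 1 : 0;  /* what the carry flag would hold */
    a = (uint8_t)(a << 1);
    return (uint8_t)(a | carry);
}
```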
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #25 on: May 09, 2015, 10:21:58 pm »
In fact, that was another one of RalphD's discoveries:  http://nerdralph.blogspot.com/2015/03/fastest-avr-software-spi-in-west.html
(discussed some more here: http://www.avrfreaks.net/forum/avr-gcc-creating-loop-counters-out-nothing-no-reason )
(tl/dr summary: gcc creates and manipulates a 16bit loop counter for the supposed-to-be clever "for(bit = 0x80; bit; bit >>= 1) {" loop.  Really annoying!)

I've also found gcc does dumb things like creating a stack frame to save a register when a single push will do:
Code:
  b8:   cf 93           push    r28
  ba:   df 93           push    r29
  bc:   1f 92           push    r1
  be:   cd b7           in      r28, 0x3d       ; 61
  c0:   de b7           in      r29, 0x3e       ; 62
  c2:   c2 98           cbi     0x18, 2 ; 24
  c4:   69 83           std     Y+1, r22        ; 0x01
  c6:   d4 df           rcall   .-88            ; 0x70 <spi_byte.1455>
  c8:   69 81           ldd     r22, Y+1        ; 0x01
  ca:   86 2f           mov     r24, r22
  cc:   d1 df           rcall   .-94            ; 0x70 <spi_byte.1455>
  ce:   c2 9a           sbi     0x18, 2 ; 24
  d0:   0f 90           pop     r0
  d2:   df 91           pop     r29
  d4:   cf 91           pop     r28
  d6:   08 95           ret

Once I have a chance to strip the source code down to a clean example and show what the compiler should do (push r22/pop r24) I'll report the bug.

 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #26 on: May 09, 2015, 10:28:19 pm »
MCU-specific compilers like IAR are  likely to do a much better job than generic ones like GCC as they will be designed to have better visibility of the underlying instruction set.
s/likely/should/
In reality it doesn't, based on some posts I've read on avrfreaks from people who shared the disassembly from IAR of source I and others had compiled with gcc.
One area where gcc does a good job (independent of target architecture) is compile-time constant evaluation: when a function call can be evaluated at compile time, gcc substitutes the result.  With LTO, it even does this across .o files.
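A sketch of what that looks like in practice, using a hypothetical helper (`ubrr_for` and `ubrr_9600_at_16mhz` are made-up names, and the formula assumes an AVR-style UART in normal 16x-oversampling mode with rounding to nearest):

```c
#include <stdint.h>

/* Hypothetical helper: baud-rate divisor for an AVR-style UART in
 * normal (16x oversampling) mode, rounded to nearest. */
static inline uint16_t ubrr_for(uint32_t f_cpu, uint32_t baud)
{
    return (uint16_t)((f_cpu + 8UL * baud) / (16UL * baud) - 1UL);
}

/* Both arguments are compile-time constants, so an optimizing compiler
 * can reduce this call to loading a literal. */
uint16_t ubrr_9600_at_16mhz(void)
{
    return ubrr_for(16000000UL, 9600UL);
}
```

With optimization on, the 32-bit division should not appear in the output at all; if it does, the arguments were not visible as constants at the call site.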
 

Offline ralphd (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #27 on: May 09, 2015, 10:31:11 pm »
In fact, that was another one of RalphD's discoveries:  http://nerdralph.blogspot.com/2015/03/fastest-avr-software-spi-in-west.html
(discussed some more here: http://www.avrfreaks.net/forum/avr-gcc-creating-loop-counters-out-nothing-no-reason )
(tl/dr summary: gcc creates and manipulates a 16bit loop counter for the supposed-to-be clever "for(bit = 0x80; bit; bit >>= 1) {" loop.  Really annoying!)

I've also found gcc does dumb things like creating a stack frame to save a register when a single push will do:
Code:
  b8:   cf 93           push    r28
  ba:   df 93           push    r29
  bc:   1f 92           push    r1
  be:   cd b7           in      r28, 0x3d       ; 61
  c0:   de b7           in      r29, 0x3e       ; 62
  c2:   c2 98           cbi     0x18, 2 ; 24
  c4:   69 83           std     Y+1, r22        ; 0x01
  c6:   d4 df           rcall   .-88            ; 0x70 <spi_byte.1455>
  c8:   69 81           ldd     r22, Y+1        ; 0x01
  ca:   86 2f           mov     r24, r22
  cc:   d1 df           rcall   .-94            ; 0x70 <spi_byte.1455>
  ce:   c2 9a           sbi     0x18, 2 ; 24
  d0:   0f 90           pop     r0
  d2:   df 91           pop     r29
  d4:   cf 91           pop     r28
  d6:   08 95           ret

Once I have a chance to strip the source code down to a clean example and show what the compiler should do (push r22/pop r24) I'll report the bug.

And the above bug exists even in 5.1 with ipa-ra optimization enabled.  The called function (spi_byte) doesn't even use r22, so with ipa-ra there should only be a mov r24, r22 between the first and 2nd calls to spi_byte.
 

Offline TerminalJack505

  • Super Contributor
  • ***
  • Posts: 1310
  • Country: 00
Re: AVR compiler that does a good job optimizing for size?
« Reply #28 on: May 09, 2015, 11:33:46 pm »
MCU-specific compilers like IAR are  likely to do a much better job than generic ones like GCC as they will be designed to have better visibility of the underlying instruction set.
s/likely/should/
In reality it doesn't, based on some posts I've read on avrfreaks from people who shared the disassembly from IAR of source I and others had compiled with gcc.
One area gcc does a good job (independent of target architecture) with is compile time constant evaluation - i.e. when you have a function that can be evaluated at compile time, gcc will substitute the evaluation.  With lto, it even does it across .o's.

I would be very surprised if the IAR AVR compiler doesn't do a better job than the GCC AVR compiler.  I've never tried it, however, so I can't speak from experience.

We use MSP430s at work and I compiled the same project with IAR, TI's Optimizing C/C++ compiler and GCC.  IAR blew the other compilers out of the water: the TI code was 40% larger than the IAR code, and the GCC code was 50% larger.

We have a very experienced assembly programmer on staff and even he admitted that, under most circumstances, he couldn't produce more efficient code than the IAR compiler.
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21681
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: AVR compiler that does a good job optimizing for size?
« Reply #29 on: May 10, 2015, 02:27:45 am »
So there you have it... if you want "real" results, use a "real" architecture, not some kid's toy... :P

MSP430 are well known in the industry.  Or you can just go with pretty much anyone's ARM, buy a few k more Flash and forget about optimization or code size...

Tim
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11631
  • Country: my
  • reassessing directives...
Re: AVR compiler that does a good job optimizing for size?
« Reply #30 on: May 10, 2015, 04:16:22 am »
buy a few k more Flash and forget about optimization or code size...
+1, no need to prove I'm a superman. Even if I could become one, the time wasted would be unrecoverable.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8270
Re: AVR compiler that does a good job optimizing for size?
« Reply #31 on: May 10, 2015, 05:22:17 am »
buy a few k more Flash and forget about optimization or code size...
+1, no need to prove myself is a superman. even if i can become one, time wasted will be unrecoverable.
Not feasible in case of high-volume product. E.g. with 10M units, saving even $0.01 per unit will save $100,000 in total... and spending a few days' time to squeeze everything in is more than worth it.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: AVR compiler that does a good job optimizing for size?
« Reply #32 on: May 10, 2015, 06:48:05 am »
Quote
if you want "real" results, use a "real" architecture, not some kid's toy...
Really?  You're going to retroactively condemn an architecture because you don't like a "toy" that was built using it?  (I assume this is more "arduino bashing?")

I'll agree with Ralph.  I don't demand that a compiler produce code that is as good as I could do in assembler, but I find it really frustrating when it produces "obviously silly" code.  I guess most of the optimization takes place before the code is really AVR-specific, so sign-extension followed by overwriting the high byte doesn't get compressed as it could be.  Sigh.  I would think some peephole optimizer should come along later and notice that the same register is being overwritten twice in a row, but perhaps that just isn't done anymore.

BTW, here's what I get for an ARM thumb compile:
Code:
00000000 <u8tohex>:
   0: 0903      lsrs r3, r0, #4
   2: 2b09      cmp r3, #9
   4: d900      bls.n 8 <u8tohex+0x8>
   6: 3311      adds r3, #17
   8: 22c0      movs r2, #192 ; 0xc0
   a: 0192      lsls r2, r2, #6
   c: 021b      lsls r3, r3, #8
   e: 189b      adds r3, r3, r2
  10: 220f      movs r2, #15
  12: 4010      ands r0, r2
  14: 2809      cmp r0, #9
  16: d900      bls.n 1a <u8tohex+0x1a>
  18: 3011      adds r0, #17
  1a: 3030      adds r0, #48 ; 0x30
  1c: 4318      orrs r0, r3
  1e: 0400      lsls r0, r0, #16
  20: 0c00      lsrs r0, r0, #16
  22: 4770      bx lr

Love that lsls/lsrs short-to-int conversion!
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13746
  • Country: gb
    • Mike's Electric Stuff
Re: AVR compiler that does a good job optimizing for size?
« Reply #33 on: May 10, 2015, 09:09:04 am »

I'll agree with Ralph.  I don't demand that a compiler produce code that is as good as I could do in assembler, but I find it really frustrating when it produces "obviously silly" code.  I guess most of the optimization takes place before the code is really AVR-specific, so sign-extension followed by overwriting the high byte doesn't get compressed as it could be.  Sigh.  I would think some peephole optimizer should come along later and notice that the same register is being overwritten twice in a row, but perhaps that just isn't done anymore.

Considering that the AVR was probably the first MCU architecture that was specifically designed to be compiler-friendly, you'd think it wouldn't be too hard to do a good job.
A general-purpose compiler like GCC is likely to have code size lower down on its list of priorities than an MCU-specific one like IAR.

 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11631
  • Country: my
  • reassessing directives...
Re: AVR compiler that does a good job optimizing for size?
« Reply #34 on: May 10, 2015, 11:57:45 am »
buy a few k more Flash and forget about optimization or code size...
+1, no need to prove myself is a superman. even if i can become one, time wasted will be unrecoverable.
Not feasible in case of high-volume product. E.g. with 10M units, saving even $0.01 per unit will save $100,000 in total... and spending a few days' time to squeeze everything in is more than worth it.
Make your product $0.01 more expensive, problem solved. Squeezing the code (algorithm) for a few days is sensible; hunting for a "supersmall code generating" compiler that costs a $1000 subscription a year is not, cost-wise. But really... if you can sell 10M units, even a $10K subscription doesn't hurt, and making the product $0.01 cheaper doesn't hurt either. Time is gold.
 

Offline Muxr

  • Super Contributor
  • ***
  • Posts: 1369
  • Country: us
Re: AVR compiler that does a good job optimizing for size?
« Reply #35 on: May 10, 2015, 05:43:16 pm »
buy a few k more Flash and forget about optimization or code size...
+1, no need to prove myself is a superman. even if i can become one, time wasted will be unrecoverable.
Not feasible in case of high-volume product. E.g. with 10M units, saving even $0.01 per unit will save $100,000 in total... and spending a few days' time to squeeze everything in is more than worth it.
make your product $0.01 more expensive, problem solved. squeezing codes (algorithm) for a few days is sensible, finding a "supersmall code generating" compiler that cost $1000 subscription a year is not, costly-wise. but really... if you can sell 10M units WTH even $10K subscription doesnt hurt, making product $0.01 cheaper doesnt hurt, time is gold.
It's all a tradeoff. I mean if you really don't want to be at the whim of a compiler and you need to squeeze every bit of resources you have, write your code in assembly.
 

Offline bson

  • Supporter
  • ****
  • Posts: 2270
  • Country: us
Re: AVR compiler that does a good job optimizing for size?
« Reply #36 on: May 10, 2015, 11:34:05 pm »
gcc is very good at high-level "pro forma" elimination.  Take this example for ARM:

Code: [Select]
// Return min, max for two operands of any type T that implements operator<()

template<typename T>
T max(const T& a, const T& b) {
    return b < a ? a : b;
}

template<typename T>
T min(const T& a, const T& b) {
    return b < a ? b : a;
}

int
foo(const char arg) {
    char a = 1;
    char b = 2;
    char c = 3;

    // d = b
    const char d = min(max(a, b), c);
    return min(d, arg);
}

Including the function and module wrappers, it generates exactly this for ARM7T (g++ 4.6.0):

Code: [Select]
    .cpu arm7tdmi
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 1
    .eabi_attribute 30, 2
    .eabi_attribute 18, 4
    .file   "foo.cxx"
    .text
    .align  2
    .global _Z3fooc
    .type   _Z3fooc, %function
_Z3fooc:
    .fnstart
.LFB2:
    @ Function supports interworking.
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    cmp r0, #1
    movhi   r0, #2
    bx  lr
    .cantunwind
    .fnend
    .size   _Z3fooc, .-_Z3fooc
    .ident  "GCC: (GNU) 4.6.0"

Basically, R0 is the input; if it is greater than 1 it gets changed to 2 before the function returns via LR.  The function is exactly three instructions, which shows the compiler has completely reduced the constant expressions, even through template functions with an inferred type parameter.  For large software projects this is the truly golden stuff.  Small checksumming loops and such are easily hand-rolled in inline assembly (which gcc will also inline nicely, since the asm statement declares what side effects it has).  And this is an old compiler.

Gcc is an excellent compiler for large software projects, but it has never been good for small 8/16-bit parts.  People spent a ridiculous amount of effort trying to use it with the 8086, to no avail.  It wants a 32/64-bit processor with lots of GPRs, a flat address space, and an orthogonal instruction set.
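
For what it's worth, the constant folding in the foo() example can be checked without reading the assembly.  A minimal sketch, assuming a C++11 or later compiler: the names min_of, max_of, kD and foo_folded are made up for this illustration (they are not in the code above), constexpr is added so that static_assert can evaluate things at compile time, and unsigned char stands in for plain char, which is unsigned on ARM EABI.

```cpp
// constexpr variants of the two templates (hypothetical names, chosen so
// they cannot clash with std::min / std::max).
template<typename T>
constexpr T max_of(const T& a, const T& b) {
    return b < a ? a : b;
}

template<typename T>
constexpr T min_of(const T& a, const T& b) {
    return b < a ? b : a;
}

// The constant part of foo(): min(max(1, 2), 3) is known at compile time.
constexpr unsigned char kD =
    min_of<unsigned char>(max_of<unsigned char>(1, 2), 3);
static_assert(kD == 2, "min(max(1, 2), 3) folds to 2");

// What is left of foo() after folding: min(2, arg).
// The unsigned type matches the unsigned "hi" condition in the listing.
constexpr int foo_folded(unsigned char arg) {
    return min_of<unsigned char>(kD, arg);
}

static_assert(foo_folded(0) == 0, "below 2: passed through");
static_assert(foo_folded(1) == 1, "below 2: passed through");
static_assert(foo_folded(2) == 2, "equal to 2");
static_assert(foo_folded(200) == 2, "above 2: clamped to 2");
```

If any of the static_asserts were wrong the file would simply not compile, which is the same reduction the three-instruction listing shows: compare against 1, replace with 2 if higher.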
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21681
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: AVR compiler that does a good job optimizing for size?
« Reply #37 on: May 11, 2015, 01:40:11 am »
Quote
if you want "real" results, use a "real" architecture, not some kid's toy...
Really?  You're going to retroactively condemn an architecture because you don't like a "toy" that was built using it?  (I assume this is more "arduino bashing?")

Maybe a little, but more generally.  I get the impression AVR never really gained a huge market share (compared to cheap and quirky PICs, or for that matter, 8051, the old standby), and therefore never had quite as much development towards it.

The XMEGAs are nicely featured (and actually specify most parameters, instead of having poorly spec'd and shitty performing peripherals like the "10 bit" (I use the term loosely) ADC, the "voltage reference" that's little better than a resistor, etc.), but they took a while to enter the market, too.  It's probably quite appropriate that a "toy" (i.e., Arduino) got developed from it -- it is a pretty easy platform to get started with.  But for the same reason, it's a bit too thin and quirky to see as much professional use.

What's weird is I rather enjoyed the MEGA documentation, and the XMEGA documentation is fragmentary and opaque.  Sausage effect?

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline ralphdTopic starter

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #38 on: May 11, 2015, 02:08:42 am »
I get the impression AVR never really gained a huge market share (compared to cheap and quirky PICs, or for that matter, 8051, the old standby), and therefore never had quite as much development towards it.

I'm sure with enough digging someone could find sales numbers on AVR vs PIC & 8051 MCUs.  Anecdotally, I took apart a CO2/temp/humidity sensor and found it had an ATmega32 with an FTDI chip for interfacing to USB.  I also know a guy into amateur racing who has a dashboard that uses 3 AVRs for the controllers.
Unthinking respect for authority is the greatest enemy of truth. Einstein
 

Offline ralphdTopic starter

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #39 on: May 11, 2015, 02:45:54 am »
BTW, here's what I get for an ARM thumb compile:
Code: [Select]
00000000 <u8tohex>:
   0: 0903      lsrs r3, r0, #4
   2: 2b09      cmp r3, #9
   4: d900      bls.n 8 <u8tohex+0x8>
   6: 3311      adds r3, #17
   8: 22c0      movs r2, #192 ; 0xc0
   a: 0192      lsls r2, r2, #6
   c: 021b      lsls r3, r3, #8
   e: 189b      adds r3, r3, r2
  10: 220f      movs r2, #15
  12: 4010      ands r0, r2
  14: 2809      cmp r0, #9
  16: d900      bls.n 1a <u8tohex+0x1a>
  18: 3011      adds r0, #17
  1a: 3030      adds r0, #48 ; 0x30
  1c: 4318      orrs r0, r3
  1e: 0400      lsls r0, r0, #16
  20: 0c00      lsrs r0, r0, #16
  22: 4770      bx lr

Love that lsls/lsrs short-to-int conversion!

Thanks.  I've considered trying out ARM MCUs (STM32 or LPC), but haven't even installed a compiler yet.  I had a bit of difficulty following the code, and at first I thought it was because I'm green on ARM assembler.  After checking the instruction set and finding that I had interpreted the instructions correctly, I realized gcc is doing some pretty dumb stuff there too.  What had me confused the most was this:
Code: [Select]
   8: 22c0      movs r2, #192 ; 0xc0
   a: 0192      lsls r2, r2, #6
   c: 021b      lsls r3, r3, #8
   e: 189b      adds r3, r3, r2
What would have made sense to me is:
Code: [Select]
adds r3, #48
lsls r3, r3, #8
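
The two sequences are interchangeable because 192 << 6 and 48 << 8 are both 0x3000.  A small C++ sketch of that arithmetic, assuming the registers are modelled as 32-bit unsigned values; the function names are invented for this illustration:

```cpp
#include <cstdint>

// What the compiler emitted: build 0x3000 in r2 (192 << 6), shift r3 left
// by 8, then add the two.
constexpr std::uint32_t compiler_seq(std::uint32_t r3) {
    return (r3 << 8) + (std::uint32_t{192} << 6);
}

// The shorter form: add 48 ('0') first, then shift left by 8.
constexpr std::uint32_t shorter_seq(std::uint32_t r3) {
    return (r3 + 48) << 8;
}

static_assert((192u << 6) == (48u << 8), "both constants are 0x3000");
static_assert(compiler_seq(0) == shorter_seq(0), "nibble 0");
static_assert(compiler_seq(9) == shorter_seq(9), "nibble 9");
static_assert(compiler_seq(15 + 17) == shorter_seq(15 + 17),
              "nibble 15 after the +17 adjustment");
```

Since the shift distributes over the addition, the shorter form saves two instructions (4 bytes) and frees r2.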

I had expected that, with ARM being a more popular architecture than the AVR, there would be more people working on the ARM gcc back-end, and therefore it would generate more optimal code than the AVR one.  That, combined with the more powerful instruction set, should make the generated code much more compact than for the AVR.  Instead, the generated code is only 2 bytes shorter than the AVR version.  And much like the AVR version, it was pretty easy to slim down the code by ~22% (36 bytes down to 28):
Code: [Select]
  lsrs r3, r0, #4
  cmp r3, #9
  bls.n +2
  adds r3, #17
  adds r3, #48
  lsls r3, r3, #8
  movs r2, #15
  ands r0, r2
  cmp r0, #9
  bls.n +2
  adds r0, #17
  adds r0, #48
  orrs r0, r3
  bx lr
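
Dropping the final lsls/lsrs pair is safe because the result never needs more than 16 bits.  A rough C++ model of both listings, read off the disassembly only (it is not the original C source), assuming the input byte arrives zero-extended in r0 as the compiler's own code does; all names here are hypothetical:

```cpp
#include <cstdint>

// One nibble as both listings treat it: add 17 if it is above 9, then add 48.
constexpr std::uint32_t adjust_nibble(std::uint32_t n) {
    return (n > 9 ? n + 17 : n) + 48;
}

// Compiler output: ends by truncating to 16 bits (lsls #16 then lsrs #16).
constexpr std::uint32_t gcc_version(std::uint8_t u8) {
    return ((adjust_nibble(u8 >> 4) << 8 | adjust_nibble(u8 & 15)) << 16) >> 16;
}

// Hand-written version: same computation without the final truncation.
constexpr std::uint32_t hand_version(std::uint8_t u8) {
    return adjust_nibble(u8 >> 4) << 8 | adjust_nibble(u8 & 15);
}

// Compare the two models over every possible input byte.
inline bool versions_agree() {
    for (unsigned v = 0; v < 256; ++v) {
        if (gcc_version(static_cast<std::uint8_t>(v)) !=
            hand_version(static_cast<std::uint8_t>(v))) {
            return false;
        }
    }
    return true;
}
```

The largest value either nibble can reach is 15 + 17 + 48 = 80 (0x50), so the combined result tops out at 0x5050 and the 16-bit truncation has nothing to clear.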
 

Offline ralphdTopic starter

  • Frequent Contributor
  • **
  • Posts: 445
  • Country: ca
    • Nerd Ralph
Re: AVR compiler that does a good job optimizing for size?
« Reply #40 on: May 12, 2015, 01:00:32 am »
Decided to download the IAR eval version.  I didn't get very far. The registration and licence manager is a pain compared to other eval software I've tried.  Once I found out it doesn't use avr-libc I just deleted it.
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13746
  • Country: gb
    • Mike's Electric Stuff
Re: AVR compiler that does a good job optimizing for size?
« Reply #41 on: May 12, 2015, 09:07:01 am »
Decided to download the IAR eval version.  I didn't get very far. The registration and licence manager is a pain compared to other eval software I've tried.  Once I found out it doesn't use avr-libc I just deleted it.
Before I moved to mostly using PICs, I used IAR a lot for AVR and ARM - it's an excellent compiler and IDE, with lots of features I miss using MPLAB and XC for PICs.
They also supply a huge number of ready-to-run examples, which helps enormously when getting up & running.
Probably my biggest complaint is they seemed to change major versions quite often - I must have about half a dozen versions installed for various legacy projects that I can't be bothered to update. It's often fairly minor stuff, it's just not been worth the time to investigate updating for the later versions.
Oh, and the big cost step to go from the free to the full version - AVR isn't too bad, but ARM is a big cost jump (I've not looked for a while to see if they've added any intermediate options for ARM).
 
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

