| [solved] AVR-GCC generates ASM stuffed with bloatware use of reserved registers |
| RoGeorge:
Using Eclipse with avr-gcc (GCC) 5.4.0 and these compiling parameters (copied from Eclipse project settings):

-Wall -g2 -gdwarf-2 -O0 -fpack-struct -fshort-enums -ffunction-sections -fdata-sections -std=gnu99 -funsigned-char -funsigned-bitfields -mmcu=attiny13 -DF_CPU=9600000UL

I was trying to poke bit 4 in PORTB to 0 then 1 at the maximum possible speed. No optimization (-O0) is intentional. The MCU is an ATtiny13.

Whatever C syntax I try for flipping the PB4 bit, the compiler generates a lot of extra assembly code for each C line. What's more intriguing is that these extra ASM instructions are handling some registers that are marked as reserved in the ATtiny13 documentation (for example reserved r24 and r25 - see page 7 of 22 in https://ww1.microchip.com/downloads/en/DeviceDoc/2535S.pdf ). The PORTB is r18.

For example (from the compiled .lss):

--- Code: ---
	PORTB |= _BV(PB4);	// set PB4 LED on
  98:	88 e3       	ldi	r24, 0x38	; 56
  9a:	90 e0       	ldi	r25, 0x00	; 0
  9c:	28 e3       	ldi	r18, 0x38	; 56
  9e:	30 e0       	ldi	r19, 0x00	; 0
  a0:	f9 01       	movw	r30, r18
  a2:	20 81       	ld	r18, Z
  a4:	20 61       	ori	r18, 0x10	; 16
  a6:	fc 01       	movw	r30, r24
  a8:	20 83       	st	Z, r18
	PORTB &= ~_BV(PB4);	// PB4 LED off
  aa:	88 e3       	ldi	r24, 0x38	; 56
  ac:	90 e0       	ldi	r25, 0x00	; 0
  ae:	28 e3       	ldi	r18, 0x38	; 56
  b0:	30 e0       	ldi	r19, 0x00	; 0
  b2:	f9 01       	movw	r30, r18
  b4:	20 81       	ld	r18, Z
  b6:	2f 7e       	andi	r18, 0xEF	; 239
  b8:	fc 01       	movw	r30, r24
  ba:	20 83       	st	Z, r18
--- End code ---

No matter what C code I tried for poking the PB4 bit, the compiler never made use of the CBI/SBI assembler instructions:
https://onlinedocs.microchip.com/pr/GUID-0B644D8F-67E7-49E6-82C9-1B2B9ABE6A0D-en-US-19/index.html?GUID-DC827DBD-2D0E-4697-83A9-047661370309

Looking at the disassembled binary loaded into the MCU, all those reserved registers are indeed loaded with some numbers and handled apparently for no reason. :-//

The code is a mess; I'm adding it here only to spare you the time of unzipping the attached project.
The commented-out C lines are other attempts that also failed to produce the expected ASM instructions CBI/SBI register, bit.

--- Code: ---
// #define F_CPU (9600000UL)
// #define __AVR_ATtiny13__ __AVR_ATtiny13__

#include <avr/io.h>
//#include <avr/builtins.h>
#include <avr/iotn13.h>
#include <avr/interrupt.h>
#include <avr/sleep.h>
// #include <util/delay.h>

#define DELAY 200U
#define MAXCOUNT 16

static volatile int counter;

void WDT_off(void)
{
	// cli();
	//__watchdog_reset();
	/* Clear WDRF in MCUSR */
	MCUSR &= ~(1<<WDRF);
	/* Write logical one to WDCE and WDE */
	/* Keep old prescaler setting to prevent unintentional time-out */
	WDTCR |= (1<<WDCE) | (1<<WDE);
	/* Turn off WDT */
	WDTCR = 0x00;
	// __enable_interrupt();
}

ISR(WDT_vect)
{
	//blink
}

ISR(ANA_COMP_vect, ISR_BLOCK)
{
	//blink
	//if (ACSR & _BV(ACO))
	/*
	if (ACSR & (1<<ACO))
		PORTB |= _BV(PB4);	// set PB4 LED on
	else
		PORTB &= ~_BV(PB4);	// PB4 LED off
	*/
	/*
	counter++;
	if (counter >= MAXCOUNT) {
		PORTB ^= _BV(PB4);	//toggle PB4
		counter = 0;
	}
	*/
}

int main(void)
{
	cli();
	WDT_off();
	DDRB |= _BV(PB4);	//PB4 as Output

	// ACSR – Analog Comparator Control and Status Register (default all 0)
	// Bit 7 – ACD: Analog Comparator Disable
	// Bit 6 – ACBG: Analog Comparator Bandgap Select
	// Bit 5 – ACO: Analog Comparator Output (read only)
	// Bit 4 – ACI: Analog Comparator Interrupt Flag
	// Bit 3 – ACIE: Analog Comparator Interrupt Enable
	// Bit 2 – Res: Reserved Bit
	// Bits 1, 0 – ACIS1, ACIS0: Analog Comparator Interrupt Mode Select
	//	00 - interrupt on toggle
	//	01 - reserved
	//	10 - interrupt on falling edge
	//	11 - interrupt on rising edge
	ACSR |= _BV(ACIE);	// enable Analog Comparator interrupts

	// disable watchdog timer
	// ? clear AC int flag (ACI)
	// enable AC interrupts
	// enable global interrupts
	//sei();
	cli();
	//sleep_mode();

	// forever loop
	for(;;) {
		//sleep_enable();
		PORTB |= _BV(PB4);	// set PB4 LED on
		PORTB &= ~_BV(PB4);	// PB4 LED off
		PORTB |= _BV(PB4);	// set PB4 LED on
		PORTB &= ~_BV(PB4);	// PB4 LED off
		PORTB |= _BV(PB4);	// set PB4 LED on
		PORTB &= ~_BV(PB4);	// PB4 LED off

		///////////////////
		struct bits {
			uint8_t b0:1;
			uint8_t b1:1;
			uint8_t b2:1;
			uint8_t b3:1;
			uint8_t b4:1;
			uint8_t b5:1;
			uint8_t b6:1;
			uint8_t b7:1;
		} __attribute__((__packed__));
		#define SBIT(port,pin) ((*(volatile struct bits*)&port).b##pin)

		//bit defines
		#define WR SBIT(PORTB,4)
		#define WR_DDR SBIT(DDRB,4)

		//usage
		//WR_DDR = 1;
		WR = 1;
		WR = 0;	//will result in sbi/cbi
		WR = 1;
		WR = 0;
		WR = 1;
		WR = 0;

		///////////////////
		// asm volatile (
		//	"SBI r18, 4" "\n\t");
		//	"CBI(PORTB, 4);");
		/*
		SBI(PORTB, 4);
		CBI(PORTB, 4);
		SBI(PORTB, 4);
		CBI(PORTB, 4);
		*/

		///////////////////
		/*
		PORTB |= (1 << PB4);	// set pin 4 of Port B high
		PORTB &= ~(1 << PB4);	// set pin 4 of Port B low
		PORTB |= (1 << PORTB3);	// set pin 4 high again
		PORTB &= ~(1 << PORTB3);	// set pin 4 high again
		PORTB ^= (1 << PB4);	// set pin 4 high again
		PORTB ^= (1 << PORTB4);	// set pin 4 high again
		*/

		///////////////////
		/*
		bit_is_set(PORTB, 4);
		bit_is_clear(PORTB, 4);
		bit_is_set(PORTB, 4);
		bit_is_clear(PORTB, 4);
		bit_is_set(PORTB, 4);
		bit_is_clear(PORTB, 4);
		*/
	}
}
--- End code ---

I chose -O0 because I don't want any C line optimized away or rearranged. Later I need to step through the C code line by line with a hardware debugger.

I was expecting to see something like this in assembler, corresponding to set/reset of PORTB bit 4:

--- Code: ---
sbi r18, 4
cbi r18, 4
...
--- End code ---

Is there any chance of making the compiler produce something like that when poking at port bits? And why is the compiler using those reserved registers r24 and r25? |
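For anyone who wants to see what the SBIT bit-field macro in the post above does without touching the hardware, here is a minimal host-side sketch. It is not the original project: `fake_portb` is a hypothetical stand-in for PORTB (an ordinary variable, not a real I/O register), and `led_on`, `led_off` and `port_value` are illustrative names. It also assumes GCC's usual layout, where the first declared bit-field lands in bit 0; the C standard does not guarantee that.

```c
#include <assert.h>
#include <stdint.h>

/* One single-bit field per port pin, as in the post above. */
struct bits {
    uint8_t b0:1; uint8_t b1:1; uint8_t b2:1; uint8_t b3:1;
    uint8_t b4:1; uint8_t b5:1; uint8_t b6:1; uint8_t b7:1;
} __attribute__((__packed__));

/* The overlay only makes sense if the struct is exactly one byte wide. */
static_assert(sizeof(struct bits) == 1, "struct bits must overlay one byte");

/* View a byte-wide "port" as eight named bits and pick one of them. */
#define SBIT(port, pin) ((*(volatile struct bits *)&port).b##pin)

/* Hypothetical stand-in for PORTB so the sketch runs on a PC. */
static volatile uint8_t fake_portb;

#define WR SBIT(fake_portb, 4)

static void led_on(void)  { WR = 1; }   /* sets bit 4, leaves the others alone */
static void led_off(void) { WR = 0; }   /* clears bit 4 */

static uint8_t port_value(void) { return fake_portb; }
```

Calling `led_on()` should leave `fake_portb` at 0x10 on a typical GCC host, and `led_off()` should bring it back to 0x00. On the AVR itself, whether such an assignment becomes a single SBI/CBI still depends on the optimization level.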
| jfiresto:
You may have to live with disappointment. The peak performance of avr-gcc code generation was somewhere around 3.4.6 – since then it has gotten dispiritingly bad. |
| RoGeorge:
Turns out the bloatware instructions happen only when forcing the no-optimization flag -O0. Any other optimization level will generate the expected SBI/CBI assembler instructions for set/clear bit, and also won't make use of the reserved registers. I guess I'll have to give up on the idea of no code optimization. |
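A side note on what the -O0 listing in the first post is doing. On classic AVR parts the I/O registers are also visible in data space at an offset of 0x20, so PORTB at I/O address 0x18 shows up at data address 0x38, which is the constant the listing loads. r24/r25 and r18/r19 are ordinary general-purpose CPU registers that hold this pointer before it is copied into Z (r30:r31); as far as can be told from the datasheet, the "Reserved" rows in its register summary describe I/O addresses, not CPU registers. The sketch below models the load/modify/store sequence on a PC. All names in it (`data_space`, `io_to_data_addr`, `set_bit_rmw`, `clear_bit_rmw`) are hypothetical, and the array is only a stand-in for the AVR address space.

```c
#include <assert.h>
#include <stdint.h>

#define SFR_OFFSET    0x20u  /* I/O address -> data-space address on classic AVR */
#define PORTB_IO_ADDR 0x18u  /* PORTB in the ATtiny13 register summary */

/* Host-side stand-in for the first 0x60 bytes of AVR data space
   (32 CPU registers followed by 64 I/O registers). */
static uint8_t data_space[0x60];

/* Translate an I/O address (as used by SBI/CBI/IN/OUT) into the
   data-space address used by LD/ST. */
static uint16_t io_to_data_addr(uint8_t io_addr)
{
    assert(io_addr < 0x40);
    return (uint16_t)(io_addr + SFR_OFFSET);
}

/* Models the unoptimized "PORTB |= _BV(PB4)" sequence from the listing. */
static void set_bit_rmw(uint8_t io_addr, uint8_t bit)
{
    assert(bit < 8);
    uint16_t z = io_to_data_addr(io_addr);  /* ldi + movw: Z = 0x0038 for PORTB */
    uint8_t tmp = data_space[z];            /* ld   r18, Z                      */
    tmp |= (uint8_t)(1u << bit);            /* ori  r18, 0x10 (for bit 4)       */
    data_space[z] = tmp;                    /* st   Z, r18                      */
}

/* Models the unoptimized "PORTB &= ~_BV(PB4)" sequence. */
static void clear_bit_rmw(uint8_t io_addr, uint8_t bit)
{
    assert(bit < 8);
    uint16_t z = io_to_data_addr(io_addr);
    uint8_t tmp = data_space[z];            /* ld   r18, Z                      */
    tmp &= (uint8_t)~(1u << bit);           /* andi r18, 0xEF (for bit 4)       */
    data_space[z] = tmp;                    /* st   Z, r18                      */
}
```

With optimization enabled the compiler can replace each of these three-step sequences with a single `sbi 0x18, 4` or `cbi 0x18, 4`, which matches what the post above reports for every level other than -O0.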
| coppice:
--- Quote from: jfiresto on July 19, 2024, 07:01:50 pm ---
You may have to live with disappointment. The peak performance of avr-gcc code generation was somewhere around 3.4.6 – since then it has gotten dispiritingly bad.
--- End quote ---

Yep. The generated code for AVR, MSP430 and, I believe, any other simple core became much worse from GCC 4 onwards. As GCC focussed more and more on complex cores, it lost the plot for simple cores. I was involved in GCC for the MSP430. After being pretty content working with various revisions of GCC 3, we couldn't find any way to coax better results out of GCC 4 and later.

GCC 3 won't build on modern systems; it produces a mass of errors. I wonder how much effort it would take to make it build again? I am not sure if you just need this older GCC, or if older versions of GDB or binutils are also needed for compatibility. |
| djacobow:
You might want to try -Og

--- Quote ---
Optimize debugging experience. -Og should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience. It is a better choice than -O0 for producing debuggable code because some compiler passes that collect debug information are disabled at -O0.

Like -O0, -Og completely disables a number of optimization passes so that individual options controlling them have no effect. Otherwise -Og enables all -O1 optimization flags except for those that may interfere with debugging:

-fbranch-count-reg -fdelayed-branch -fdse -fif-conversion -fif-conversion2 -finline-functions-called-once -fmove-loop-invariants -fmove-loop-stores -fssa-phiopt -ftree-bit-ccp -ftree-dse -ftree-pta -ftree-sra
--- End quote ---

Personally, I don't think I have ever run avr-gcc without -Os, since size is by far my biggest concern most of the time. |