For instance, -fno-jump-tables reduces my current v1.50m build from 32072 to 32016 bytes (56 bytes saved).
Combining CFLAGS += -ffunction-sections with LDFLAGS += -Wl,--gc-sections brings it down further, to 31892 bytes.
With all three options added, my results change to the following:
Trying system's avr-gcc [avr-gcc (GCC) 5.4.0] ... 28330 bytes
Trying avr-gcc-10.1.0-x64-linux [avr-gcc (GCC) 10.1.0] ... 30706 bytes
Trying avr-gcc-11.1.0-x64-linux [avr-gcc (GCC) 11.1.0] ... 30574 bytes
Trying avr-gcc-12.1.0-x64-linux [avr-gcc (GCC) 12.1.0] ... 30984 bytes
Trying avr-gcc-13.2.0-fc [avr-gcc (Fedora 13.2.0-1.fc38) 13.2.0] ... 28720 bytes
Trying avr-gcc-7.3.0-arduino [avr-gcc (GCC) 7.3.0] ... 28428 bytes
Trying avr-gcc-7.3.0-x64-linux [avr-gcc (GCC) 7.3.0] ... 28430 bytes
Trying avr-gcc-8.1.0-x64-linux [avr-gcc (GCC) 8.1.0] ... 28522 bytes
Trying avr-gcc-8.2.0-x64-linux [avr-gcc (GCC) 8.2.0] ... 28522 bytes
Trying avr-gcc-8.3.0-x64-linux [avr-gcc (GCC) 8.3.0] ... 28530 bytes
Trying avr-gcc-9.1.0-x64-linux [avr-gcc (GCC) 9.1.0] ... 30494 bytes
Trying avr-gcc-9.2.0-x64-linux [avr-gcc (GCC) 9.2.0] ... 30486 bytes
Trying avr8-gnu-toolchain-linux_x86_64-microchip [avr-gcc (AVR_8_bit_GNU_Toolchain_3.7.0_1796) 7.3.0] ... 28428 bytes
I get the same results if I leave out CFLAGS += -ffunction-sections and LDFLAGS += -Wl,--gc-sections.
avr-gcc 5.4.0 produces the smallest firmware again, at least in my specific case.
My options at the moment are the following:
# compiler flags
CC = avr-gcc
CPP = avr-g++
CFLAGS = -mmcu=${MCU} -Wall -I. -Ibitmaps
CFLAGS += -DF_CPU=${FREQ}000000UL
CFLAGS += -DOSC_STARTUP=${OSC_STARTUP}
CFLAGS += -gdwarf-2 -std=gnu99 -Os -mcall-prologues
CFLAGS += -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums
CFLAGS += -flto -fno-jump-tables
CFLAGS += -MD -MP -MT $(*F).o -MF dep/$(@F).d
# linker flags
LDFLAGS = -g -mmcu=${MCU} -Wl,--relax,-Map=${NAME}.map
PS. After reading the article you suggested, I added more options, which saved a little more space: -fno-inline-small-functions -fno-split-wide-types -fno-move-loop-invariants -fno-tree-loop-optimize -mstrict-X
The result is the following:
Trying system's avr-gcc [avr-gcc (GCC) 5.4.0] ... 28072 bytes
Trying avr-gcc-10.1.0-x64-linux [avr-gcc (GCC) 10.1.0] ... 30176 bytes
Trying avr-gcc-11.1.0-x64-linux [avr-gcc (GCC) 11.1.0] ... 30188 bytes
Trying avr-gcc-12.1.0-x64-linux [avr-gcc (GCC) 12.1.0] ... 30486 bytes
Trying avr-gcc-13.2.0-fc [avr-gcc (Fedora 13.2.0-1.fc38) 13.2.0] ... 28250 bytes
Trying avr-gcc-7.3.0-arduino [avr-gcc (GCC) 7.3.0] ... 28226 bytes
Trying avr-gcc-7.3.0-x64-linux [avr-gcc (GCC) 7.3.0] ... 28228 bytes
Trying avr-gcc-8.1.0-x64-linux [avr-gcc (GCC) 8.1.0] ... 28330 bytes
Trying avr-gcc-8.2.0-x64-linux [avr-gcc (GCC) 8.2.0] ... 28330 bytes
Trying avr-gcc-8.3.0-x64-linux [avr-gcc (GCC) 8.3.0] ... 28350 bytes
Trying avr-gcc-9.1.0-x64-linux [avr-gcc (GCC) 9.1.0] ... 30150 bytes
Trying avr-gcc-9.2.0-x64-linux [avr-gcc (GCC) 9.2.0] ... 30142 bytes
Trying avr8-gnu-toolchain-linux_x86_64-microchip [avr-gcc (AVR_8_bit_GNU_Toolchain_3.7.0_1796) 7.3.0] ... 28226 bytes
Options:
# compiler flags
CC = avr-gcc
CPP = avr-g++
CFLAGS = -mmcu=${MCU} -Wall -I. -Ibitmaps
CFLAGS += -DF_CPU=${FREQ}000000UL
CFLAGS += -DOSC_STARTUP=${OSC_STARTUP}
CFLAGS += -gdwarf-2 -std=gnu99 -Os -mcall-prologues -fno-inline-small-functions -fno-split-wide-types -fno-move-loop-invariants -fno-tree-loop-optimize -mstrict-X
CFLAGS += -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums
CFLAGS += -flto -fno-jump-tables
CFLAGS += -MD -MP -MT $(*F).o -MF dep/$(@F).d
# linker flags
LDFLAGS = -g -mmcu=${MCU} -Wl,--relax,-Map=${NAME}.map
avr-gcc 5.4.0 is the winner again.
Again, this applies to my particular code configuration. For a different one, another avr-gcc version may well win.
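A comparison like the ones above could be scripted roughly as follows. This is only a sketch, not the script that produced the numbers in this post. The toolchain directory layout ($TOOLCHAINS/<name>/bin), the ELF name (firmware.elf) and the make targets (clean all) are assumptions and need adapting to the actual project. The size is taken as text + data from avr-size, which is what ends up in flash.

```shell
#!/bin/sh
# Sketch: rebuild the firmware with every avr-gcc found under $TOOLCHAINS
# and print the resulting flash size for each one.
# ASSUMPTIONS (hypothetical, adjust to your setup):
#   - each toolchain is unpacked as $TOOLCHAINS/<name>/bin/avr-gcc
#   - "make clean all" builds $ELF using whatever avr-gcc is first in PATH
TOOLCHAINS="${TOOLCHAINS:-$HOME/avr-toolchains}"
ELF="${ELF:-firmware.elf}"

# Print text + data of an ELF file in bytes.
# avr-size (Berkeley format) prints a header line, then:
#   text data bss dec hex filename
flash_size() {
    "$1/avr-size" "$2" | awk 'NR == 2 { print $1 + $2 }'
}

# Build once with the toolchain in directory $1 and report the size.
try_toolchain() {
    bindir=$1
    name=$(basename "$(dirname "$bindir")")
    version=$("$bindir/avr-gcc" --version | head -n 1)
    printf 'Trying %s [%s] ... ' "$name" "$version"
    # Put this toolchain first in PATH for the duration of the build only.
    if PATH="$bindir:$PATH" make -s clean all >/dev/null 2>&1; then
        printf '%s bytes\n' "$(flash_size "$bindir" "$ELF")"
    else
        printf 'build failed\n'
    fi
}

for bindir in "$TOOLCHAINS"/*/bin; do
    # Skip directories without an executable avr-gcc (also covers an unmatched glob).
    [ -x "$bindir/avr-gcc" ] || continue
    try_toolchain "$bindir"
done
```

The system's avr-gcc is not covered by the loop. It can be included with one extra call to try_toolchain, passing the directory that contains it (for example /usr/bin).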