Author Topic: GCC ARM32 compiler too clever, or not clever enough?  (Read 13923 times)

brucehoult
« Reply #125 on: November 04, 2022, 11:35:48 pm »
Quote
I find it unfortunate that we have to put functions in individual sections (which makes the object files bigger, not that it is a huge deal, but yeah) to get this behavior. I can't really find a rationale for not making it the default behavior of the linker.

The tradition in *nix is that the object file formats and the linker have no concept at all of the body of a function, only of entry points. A logical function can jump around arbitrarily within the section, can have multiple entry points, and so forth.

Consider this:

Code: [Select]
00000000000101e8 <isEven>:
   101e8:       c119                    beqz    a0,101ee <isEven+0x6>
   101ea:       357d                    addiw   a0,a0,-1
   101ec:       a019                    j       101f2 <isOdd>
   101ee:       4505                    li      a0,1
   101f0:       8082                    ret

00000000000101f2 <isOdd>:
   101f2:       c119                    beqz    a0,101f8 <isOdd+0x6>
   101f4:       357d                    addiw   a0,a0,-1
   101f6:       bfcd                    j       101e8 <isEven>
   101f8:       8082                    ret

Is that two functions, or one function with two entry points?

Source, compiled with just -Os...

Code: [Select]
typedef unsigned int uint;

int isOdd(uint n);
int isEven(uint n);

int isOdd(uint n) {
  return n == 0 ? 0 : isEven(n - 1);
}

int isEven(uint n) {
  return n == 0 ? 1 : isOdd(n - 1);
}
 

Nominal Animal
« Reply #126 on: November 04, 2022, 11:58:15 pm »
Quote
One question I have (I admit I don't use these options very often actually) is that, what happens if some function is never called directly in the code, but passed as a function pointer somewhere and called indirectly? Is the linker clever enough not to prune this function in this case? I suppose that taking a pointer to it should be enough to determine that said function is not dead code, but just wondering.

Yes, the linker is clever enough not to prune a function which is only called via a function pointer.

This is because of how the linker actually does this.  Because each function is in its own section, it examines the symbols referred to in each section.  It does not matter whether the function symbol is called or the address of the function is taken, because both cause the function symbol to be added to the symbol table the same way.  Then, the linker creates a disjoint-set data structure of all sections, doing a Join between a pair of sections whenever a symbol in one section is used in another section.  The linker keeps all sections that belong to the same set as the one containing the ELF start address or start function, and discards all others.

(Note: I didn't actually check that this is exactly what the GNU linker is doing; this is based on the documentation and observable behaviour, and how I'd implement it.  Using a graph instead of a disjoint-set would give the exact same results, but would be less efficient.)

(Edited: I checked.  See binutils/ld/ldlang.c:lang_gc_sections() and binutils/ld/ldlang.c:lang_end(), as well as other code using link_info.gc_sections.  It uses hash lookups, implementing the logic above, but in a different manner.)
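
To make the keep/discard decision concrete, here is a minimal sketch in C that models it as a mark pass starting from the entry section. It is not taken from binutils: SectionGraph, gc_sections(), example_graph() and the four-section layout are all made up for illustration. One detail worth noting is that references are followed in one direction only, from a kept section to the sections it refers to, so an unused function that happens to call a kept one is still discarded.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_SECTIONS = 8 };

/* Hypothetical model: refs[a][b] is true when section a contains a
   reference (call or address-of) to a symbol defined in section b. */
typedef struct {
    size_t count;
    bool refs[MAX_SECTIONS][MAX_SECTIONS];
    bool keep[MAX_SECTIONS];
} SectionGraph;

/* Mark section s and everything it refers to, directly or indirectly. */
static void mark(SectionGraph *g, size_t s)
{
    if (g->keep[s])
        return;
    g->keep[s] = true;
    for (size_t t = 0; t < g->count; t++)
        if (g->refs[s][t])
            mark(g, t);
}

/* Keep only the sections reachable from the root (entry) section. */
void gc_sections(SectionGraph *g, size_t root)
{
    for (size_t s = 0; s < g->count; s++)
        g->keep[s] = false;
    mark(g, root);
}

/* Example layout (assumed):
   0 = entry section, 1 = function called directly,
   2 = function whose address is only stored in a pointer,
   3 = unused function that itself calls section 1. */
SectionGraph example_graph(void)
{
    SectionGraph g = { .count = 4 };
    g.refs[0][1] = true;   /* entry calls 1              */
    g.refs[0][2] = true;   /* entry takes the address of 2 */
    g.refs[3][1] = true;   /* unused 3 calls 1            */
    return g;
}
```

After gc_sections(&g, 0) on the example graph, sections 0, 1 and 2 are kept and section 3 is dropped: the call and the address-of reference are treated identically, which is the point made above.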

Quote
The tradition in *nix is that the object file formats and the linker have no concept at all of the body of a function, only of entry points. A logical function can jump around arbitrarily within the section, can have multiple entry points, and so forth.

Yes, and the compiler can still generate the exact same code even with -ffunction-sections -Wl,--gc-sections, because the jump address in the ELF file is a symbol table reference, not a plain numeric address.
In other words, the two will always contain a symbolic reference to the other, so they're always either both included or both excluded.

The fact that the jump is to a different section does not matter here.  What isn't guaranteed is that the two functions will reside at nearby addresses in the final binary.
« Last Edit: November 05, 2022, 12:14:41 am by Nominal Animal »
 

brucehoult
« Reply #127 on: November 05, 2022, 12:32:47 am »
Quote
The tradition in *nix is that the object file formats and the linker have no concept at all of body of a function, but only entry points. A logical function can jump around arbitrarily within the section, can have multiple entry points, and so forth.
Yes, and the compiler can still generate the exact same code even with -ffunction-sections -Wl,--gc-sections, because the jump address in the ELF file is a symbol table reference, not a plain numeric address.
In other words, the two will always contain a symbolic reference to the other, so they're always either both included or both excluded.

The fact that the jump is to a different section does not matter here.  What isn't guaranteed, is that the two functions will reside at nearby addresses in the final binary.

The point is not that sections can be put next to each other. The point is that a section cannot be subdivided by the linker. If a compiler or programmer puts several logically distinct things in the same section, then they are inextricably joined together forever and are included or excluded as a whole.

This is different to how traditional Mac or Windows object file formats work, where every top level data item or function is always its own section.
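
As a rough illustration of "one logical thing per section", the sketch below does by hand what -ffunction-sections does automatically: it puts each function into its own input section using the GCC/Clang section attribute. It assumes an ELF target; the IN_SECTION macro is just a local convenience, and the section names follow the usual .text.<function> pattern.

```c
typedef unsigned int uint;

/* Local helper macro (not a standard name): place the following
   function in the named input section. GCC/Clang, ELF targets. */
#define IN_SECTION(name) __attribute__((section(name)))

int isOdd(uint n);
int isEven(uint n);

/* Each function lives in its own section, so the linker can keep or
   drop it independently of the other code in this file. */
IN_SECTION(".text.isOdd")
int isOdd(uint n) {
  return n == 0 ? 0 : isEven(n - 1);
}

IN_SECTION(".text.isEven")
int isEven(uint n) {
  return n == 0 ? 1 : isOdd(n - 1);
}
```

Because each function still refers to the other by symbol, the two sections end up kept or discarded together; without the attribute (and without -ffunction-sections) both would simply share .text with everything else in the file. Dumping the section headers of the object file with objdump -h should show the two separate sections.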
 

Nominal Animal
« Reply #128 on: November 05, 2022, 02:23:09 am »
Quote
The point is not that sections can be put next to each other. The point is that a section can not be subdivided by the linker. If a compiler or programmer puts several logically distinct things in the same section then they are inextricably joined together forever and included or excluded as a whole.

This is different to how traditional Mac or Windows object file formats work, where every top level data item or function is always its own section.

Right, I misunderstood your point.  I was just trying to say that while function-sections does increase the ELF object file size, it shouldn't (technically, does not need to) affect the generated code, or increase the final binary size, at all.  This is just how ELF format files can be used to do this thing.
 

SiliconWizard
« Reply #129 on: November 05, 2022, 02:43:44 am »
Yep, I understand that. I just find the choice they made a bit odd, but that's not the sole oddity that I find in Unix stuff. Everything has baggage. ;D
 

peter-h (Topic starter)
« Reply #130 on: November 05, 2022, 07:04:39 am »
Does GCC v10 remove unused code at compile-time, as a default?
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

ataradov
« Reply #131 on: November 05, 2022, 07:16:38 am »
Depends on how specifically it is "unused". Things like
Code: [Select]
int foo()
{
  if (1)
    return 1;
  else
    return 123;
}
would have the unreachable else branch optimized away at any optimization level by default.

At -O1, unused static functions are removed even without garbage collection; they never even get to the linking stage.
Alex
 

peter-h (Topic starter)
« Reply #132 on: November 05, 2022, 08:21:45 am »
I am referring (see earlier post) to the vast amount of code in the ST "HAL" code which comes with ST Cube IDE.

For example you get a file containing a dozen "SPI" functions. Some polled, some _IT (interrupt), some _DMA (for DMA). Possibly none of these get called.

I can do a test of project size, with a function having #if 0 / #endif around it, and it seems these do get removed.

This is the sort of stuff which is 99% unused



Experimentally, if I exclude a .c file from the build, and it isn't used e.g. the ...i2c.c file, then the FLASH code size does not change. I wonder what rules control this.

I am using -Og. I am fairly sure -O0 does the same, otherwise compiling with that (which I have tried) would massively bloat the project. It increases code size by about 20-30%. But maybe that 20-30% is the whole ST lib being in there?

However I have just compiled with -O0 and the size went up as expected from 351k to 502k. A quick look at some random unused function finds it in the .map file!



So, yeah, -O0 does not remove unreachable code!

And I have that checkbox ticked



but probably not the required linker option.
« Last Edit: November 05, 2022, 08:45:46 am by peter-h »

DiTBho
« Reply #133 on: November 05, 2022, 12:00:35 pm »
(for me it's all wrong approach)
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

peter-h (Topic starter)
« Reply #134 on: November 05, 2022, 12:04:17 pm »
Not everybody is as clever as you, especially not me :)

ataradov
« Reply #135 on: November 05, 2022, 04:22:43 pm »
IRQ handlers and their dependencies are not "unused". Pointers to them are used in the vector table, so the compiler has to include them. Those things will not be removed under any optimization options, since the compiler does not know under which conditions those pointers are used.

peter-h (Topic starter)
« Reply #136 on: November 05, 2022, 04:58:06 pm »
If you exclude them you will get a linker error, which I was not getting.

My conclusion is that -O0 includes all sources loaded into the Cube editor structure.

ataradov
« Reply #137 on: November 05, 2022, 05:10:38 pm »
Well, yes, -O0 would include all the code. As I said, the only things that may get optimized at -O0 are the dead code within the function itself, and some expression folding might happen too.

That's the difference between optimization levels.

I really don't get what your end goal is here.
« Last Edit: November 05, 2022, 05:12:22 pm by ataradov »

peter-h (Topic starter)
« Reply #138 on: November 05, 2022, 05:52:17 pm »
I just want to be sure that unwanted code is really being excluded.

The project was set up by someone else years ago, with a ton of libraries, and bits of these are used around the place, but probably less than 10% of them. And I don't want to go around putting

#if 0
#endif

around all that stuff.

I might also need more of it one day.

And while I do all my code "bare metal" (because I want to understand what I am doing), the HAL etc. stuff is useful for reference, so I don't want to just delete the files.

ataradov
« Reply #139 on: November 05, 2022, 05:56:13 pm »
Just enable optimization and the compiler/linker will remove that for you; you will go nuts removing that stuff by hand.

SiliconWizard
« Reply #140 on: November 05, 2022, 08:28:16 pm »
Quote
Does GCC v10 remove unused code at compile-time, as a default?

It does remove unused code that is statically (at compile-time) detectable as unused in a single compilation unit. For code inside a function, yes. As long as optimizations are enabled. For full functions, yes if they are defined static, no otherwise (which is why you need the above sections 'trick' to get rid of this kind of dead code.)
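
A small sketch of the three cases in one translation unit may help. The function names are made up, and what the comments say about removal describes what GCC is expected to do with optimization enabled; it is best confirmed by looking at the .map file or the disassembly for your own build.

```c
/* Case 1: dead code inside a function. The else branch can never run,
   so the compiler can drop it without any help from the linker. */
int foo(void)
{
    if (1)
        return 1;
    else
        return 123;
}

/* Case 2: an unused static function. Nothing outside this file can
   refer to it, so with optimization enabled the compiler can omit it.
   The attribute only silences the unused-function warning; it does
   not force the function to be kept. */
__attribute__((unused))
static int helper_never_called(int x)
{
    return x * 3;
}

/* Case 3: an unused non-static function. Another file might call it,
   so the compiler has to emit it; only the linker can discard it, and
   only if it sits in its own section (-ffunction-sections together
   with --gc-sections). */
int api_never_called(int x)
{
    return x + 1;
}

/* The only function in this sketch that the rest of the program uses. */
int used_entry(void)
{
    return foo() + 1;
}
```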

Most modern C compilers will act pretty much the same. MSVC may be a bit different as it uses a different object model and may be able to prune unused code across compilation units without requiring any specific options (as we explained earlier.)
« Last Edit: November 05, 2022, 08:30:21 pm by SiliconWizard »
 

peter-h (Topic starter)
« Reply #141 on: November 05, 2022, 08:58:42 pm »
I do have the "place functions in their own sections" enabled as shown above, but nothing else.

I am not concerned about unreachable code within a function; hopefully none of my code has that :) And anyway the compiler tends to warn about that.

I just want it to reliably remove entire unused functions; that removes the vast majority of the ST libraries.

Some functions are called via function tables. This is normal in USB and ETH code. But these will be correctly preserved because the function name (its address) is in that table.
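
A minimal sketch of that situation, with made-up handler names rather than anything from the ST USB or ETH code: the handlers are never called by name, but the table initializer stores their addresses, and that reference is what keeps them alive for as long as the table itself is reachable.

```c
#include <stddef.h>

typedef int (*request_handler)(int value);

/* Hypothetical handlers; nothing calls them directly. */
static int handle_get(int value) { return value + 1; }
static int handle_set(int value) { return value * 2; }

/* Storing the function names here takes their addresses, which counts
   as a reference to each handler just as a direct call would. */
static const request_handler handlers[] = {
    handle_get,
    handle_set,
};

/* Indirect call through the table; returns -1 for an unknown request. */
int dispatch(size_t request, int value)
{
    if (request >= sizeof handlers / sizeof handlers[0])
        return -1;
    return handlers[request](value);
}
```

If dispatch() itself is never referenced from anywhere, then the table and both handlers become unreachable too and can all be dropped together.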

SiliconWizard
« Reply #142 on: November 05, 2022, 09:03:27 pm »
Then the "place functions in their own sections" option will do the job. But I don't know how it translates as far as compile and link options go. If it only acts on the compiler options, that's not enough. The linker also needs to get the right option, and I don't know if the option you mention (which I'm assuming is a check box in some IDE) activates both.

You can additionally use the same kind of option for data, so that the linker will prune any data that is unused. Otherwise you may still have unused data in the final binary. I don't know how much "data" the ST libraries use and if that will make a difference, but that's worth a try.
« Last Edit: November 05, 2022, 09:15:00 pm by SiliconWizard »
 

peter-h (Topic starter)
« Reply #143 on: November 05, 2022, 09:14:59 pm »
There should not be much RAM usage because I have spent half my life over past 2 years checking that - the 32F417 doesn't have a whole load left by the time you have MbedTLS in there :)

Curiously, I have found there is a "master HAL module enable .h file" with this

Code: [Select]
/* ########################## Module Selection ############################## */
/**
  * @brief This is the list of modules to be used in the HAL driver
  */
#define HAL_MODULE_ENABLED 
#define HAL_ADC_MODULE_ENABLED
/* #define HAL_CAN_MODULE_ENABLED */
/* #define HAL_CAN_LEGACY_MODULE_ENABLED */ 
/* #define HAL_CRC_MODULE_ENABLED */ 
/* #define HAL_CRYP_MODULE_ENABLED */ 
#define HAL_DAC_MODULE_ENABLED
/* #define HAL_DCMI_MODULE_ENABLED */
#define HAL_DMA_MODULE_ENABLED
/* #define HAL_DMA2D_MODULE_ENABLED */
#define HAL_ETH_MODULE_ENABLED
#define HAL_FLASH_MODULE_ENABLED
/* #define HAL_NAND_MODULE_ENABLED */
/* #define HAL_NOR_MODULE_ENABLED */
/* #define HAL_PCCARD_MODULE_ENABLED */
#define HAL_SRAM_MODULE_ENABLED
/* #define HAL_SDRAM_MODULE_ENABLED */
/* #define HAL_HASH_MODULE_ENABLED */ 
#define HAL_GPIO_MODULE_ENABLED
#define HAL_I2C_MODULE_ENABLED
#define HAL_I2S_MODULE_ENABLED   
#define HAL_IWDG_MODULE_ENABLED
/* #define HAL_LTDC_MODULE_ENABLED */
#define HAL_PWR_MODULE_ENABLED   
#define HAL_RCC_MODULE_ENABLED
#define HAL_RNG_MODULE_ENABLED
#define HAL_RTC_MODULE_ENABLED
/* #define HAL_SAI_MODULE_ENABLED */   
/* #define HAL_SD_MODULE_ENABLED */ 
//#define HAL_SPI_MODULE_ENABLED
#define HAL_TIM_MODULE_ENABLED   
#define HAL_UART_MODULE_ENABLED
/* #define HAL_USART_MODULE_ENABLED */
/* #define HAL_IRDA_MODULE_ENABLED */
/* #define HAL_SMARTCARD_MODULE_ENABLED */
/* #define HAL_WWDG_MODULE_ENABLED */ 
#define HAL_CORTEX_MODULE_ENABLED
#define HAL_PCD_MODULE_ENABLED
/* #define HAL_HCD_MODULE_ENABLED */

and commenting-out e.g. HAL_SPI_MODULE_ENABLED makes zero difference to the binary size. So this is working as it should be.

One still ought to disable unused modules because if one needs to compile with -O0 for debugging, the code bloat could be a problem.
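
For reference, this kind of module switch usually works by wrapping the whole body of each driver source file in a preprocessor guard, so a disabled module compiles to nothing at any optimization level. The sketch below uses made-up DEMO_ names, not the real HAL ones, and squeezes the "header" and two "drivers" into one file purely for illustration.

```c
/* Stand-in for the module selection header (names are hypothetical). */
#define DEMO_ADC_MODULE_ENABLED
/* #define DEMO_SPI_MODULE_ENABLED */

/* Stand-in for the ADC driver file: compiled, because it is enabled. */
#ifdef DEMO_ADC_MODULE_ENABLED
int demo_adc_init(void) { return 0; }
#endif

/* Stand-in for the SPI driver file: the preprocessor removes all of
   it, so no object code is produced even at -O0. */
#ifdef DEMO_SPI_MODULE_ENABLED
int demo_spi_init(void) { return 0; }
#endif

/* Counts how many of the two demo modules were compiled in. */
int demo_enabled_module_count(void)
{
    int n = 0;
#ifdef DEMO_ADC_MODULE_ENABLED
    n++;
#endif
#ifdef DEMO_SPI_MODULE_ENABLED
    n++;
#endif
    return n;
}
```

With section garbage collection already removing the unused functions, disabling a module this way is not expected to change the optimized binary, which matches the observation above; it only matters for builds where nothing gets pruned.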

ataradov
« Reply #144 on: November 05, 2022, 09:18:46 pm »
Quote
I just want it to reliably remove entire unused functions; that removes the vast majority of the ST libraries.

Whatever settings you choose in the IDE, make sure they translate to the compiler getting "-fdata-sections -ffunction-sections" and the linker getting "--gc-sections". Those are the only conditions for things to be removed.

If you want even more optimization look at LTO. On big projects it easily saves another 10-30%.

And while you refuse to believe that, you should never compile with -O0 for anything. Use -Og for debugging; it still includes most of the stuff from -O1, with some optimizations that affect code generation disabled.
« Last Edit: November 05, 2022, 09:21:29 pm by ataradov »

SiliconWizard
« Reply #145 on: November 05, 2022, 09:34:45 pm »
As a side note for optimizations, you can use any level of opt. including -O3 and still generate debugging info using -g separately. Of course, optimized code can be harder to debug, but if you get specific issues in optimized code, that's a way of debugging it. Then be prepared to occasionally step into some assembly.
 

Nominal Animal
« Reply #146 on: November 05, 2022, 10:01:54 pm »
I don't use ST Cube IDE, but exactly like Ataradov mentioned, in a Makefile-based system you would only need
    CFLAGS := -Wall -Og -flto -fdata-sections -ffunction-sections   # plus whatever else you need
    LDFLAGS := -Wl,--gc-sections,--relax   # plus whatever else you need
assuming your implicit compilation rule is something like
    %.o: %.c
            $(CC) $(CFLAGS) -c $^
and linkage uses something like
    firmware.bin: $(expression-expanding-to-object-file-list)
            $(CC) $(CFLAGS) $^ $(LDFLAGS) -o $@

When compiling Linux applications, I use almost exactly the same, except I prefer -O2 or -Os instead of -Og.  (And like SiliconWizard said above, you can use e.g. -Os -g or -O2 -g to get debugging information.  The -Og simply selects optimizations that should not impact debuggability, whereas -Os -g optimizes for size but includes debugging information; and -O2 -g applies a lot of optimizations (it's the highest optimization level I ever use) but still includes debugging information.)
« Last Edit: November 05, 2022, 10:03:52 pm by Nominal Animal »
 

peter-h (Topic starter)
« Reply #147 on: November 05, 2022, 10:05:33 pm »
Quote
And while you refuse to believe that, you should never compile with -O0 for anything. Use -Og for debugging, it still includes most of the stuff from -O1 with some optimizations that affect code generation disabled.

Why get aggressive? What do I "refuse to believe"? I am trying to learn. This isn't some keyboard hitting contest.

As I said I normally use -Og but often a variable is shown as "optimised out", which usually looks like the variable is in a register or some such. One can work around that but -O0 avoids that.

Quote
The -Og simply selects optimizations that should not impact debuggability

That is probably true in GCC, except for the above.

Quote
-Os -g optimizes for size but includes debugging information

By "debugging information" do you mean all variables are visible?
« Last Edit: November 05, 2022, 10:17:42 pm by peter-h »

Nominal Animal
« Reply #148 on: November 05, 2022, 11:04:47 pm »
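Quote
By "debugging information" do you mean all variables are visible?

No. By "debugging information", I mean that the compiler adds metadata (DWARF, on ELF targets) describing how the machine code maps back to source lines, functions, and variables, so that the binary can be examined using a debugger.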

Useful optimization, be it for size or efficiency or other reasons, always involves removal and transformation of expressions, and that means a variable may not exist at all in the final binary.  A typical example of this is as follows:
Code: [Select]
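#include <stdint.h>

/* Integer square root helper, assumed to be defined elsewhere. */
uint_fast32_t  uisqrt64(uint_fast64_t n);

uint_fast32_t  distance(int_fast32_t x, int_fast32_t y)
{
    int_fast64_t  n2 = (int_fast64_t)x * x + (int_fast64_t)y * y;
    return uisqrt64(n2);
}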
When optimizations are enabled, and especially if uisqrt64() is a static function eligible for inlining here, there is no reason to expect n2 to be observable in the code.

While there are many ways to force the variable to exist when debugging, that is always a tradeoff between code optimization and debugging.
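
One such way, sketched below, is to declare the intermediate variable volatile. This is only one of several options, and the uisqrt64() here is a stand-in bit-by-bit integer square root written for this example, since the original helper is not shown. The volatile qualifier obliges the compiler to actually store n2 to memory and read it back, so a debugger should normally be able to show it, at the cost of an extra store and load that the optimized code would otherwise not need.

```c
#include <stdint.h>

/* Stand-in integer square root (bit-by-bit method); returns the floor
   of the square root of v. Not the original poster's implementation. */
static uint_fast32_t uisqrt64(uint_fast64_t v)
{
    uint_fast64_t r = 0;
    uint_fast64_t bit = (uint_fast64_t)1 << 62;

    while (bit > v)
        bit >>= 2;
    while (bit != 0) {
        if (v >= r + bit) {
            v -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return (uint_fast32_t)r;
}

uint_fast32_t distance(int_fast32_t x, int_fast32_t y)
{
    /* volatile: n2 must really be written to and read from memory,
       so it stays observable even when optimizing. */
    volatile int_fast64_t n2 = (int_fast64_t)x * x + (int_fast64_t)y * y;
    return uisqrt64((uint_fast64_t)n2);
}
```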

I personally handle this dichotomy by writing any key pieces of code separately, testing them thoroughly, and documenting the test results.  For example, when dealing with univariate float functions, I often test the function with all finite inputs, and compare the results to expected values calculated at double precision or better.  For multivariate functions, I test the pathological cases and a few billion random cases (using the high bits of Xorshift64* seeded from getrandom() on Linux, or, on other systems, from clock_gettime()/gettimeofday(), multiplying both the integral and fractional seconds by large primes and XORing the results together).
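
A very reduced sketch of the random-testing part, under several assumptions: the function under test is a stand-in integer square root rather than a float function, the seed is a fixed number passed in by the caller instead of coming from getrandom(), and count_sqrt_failures() is a made-up name. The generator is the usual Xorshift64* recurrence, and only its high bits are used.

```c
#include <stdint.h>

/* Xorshift64* pseudorandom generator; *state must be nonzero. */
static uint64_t xorshift64star(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *state = x;
    return x * UINT64_C(2685821657736338717);
}

/* Stand-in function under test: floor of the square root of v. */
static uint32_t uisqrt64(uint64_t v)
{
    uint64_t r = 0;
    uint64_t bit = UINT64_C(1) << 62;

    while (bit > v)
        bit >>= 2;
    while (bit != 0) {
        if (v >= r + bit) {
            v -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)r;
}

/* Runs the given number of random cases and returns how many violated
   the defining property r*r <= v < (r+1)*(r+1). */
unsigned long count_sqrt_failures(uint64_t seed, unsigned long cases)
{
    uint64_t state = seed ? seed : 1;   /* avoid the all-zero state */
    unsigned long failures = 0;

    for (unsigned long i = 0; i < cases; i++) {
        /* Use the high 48 bits, so (r + 1) * (r + 1) cannot overflow. */
        uint64_t v = xorshift64star(&state) >> 16;
        uint64_t r = uisqrt64(v);
        if (r * r > v || (r + 1) * (r + 1) <= v)
            failures++;
    }
    return failures;
}
```

Checking a property of the result, rather than comparing against a second implementation, keeps the test independent of any reference code; the pathological inputs (0, 1, the largest value) still need their own explicit checks.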

With this approach, my typical bugs are corner cases I didn't think of, and I get a headpalm and have a fix ready in a minute.  I rarely need to use a debugger on my code.  (I can, and have, including GDB accessors and helpers written in Python, when necessary; I'm just saying that having variables be accessible to me in a debugger is not important to me.)  I do, however, quite often examine the assembly code generated, to see if the writer of the problematic function and the compiler agree as to what it should really do.
 

eutectique
« Reply #149 on: November 05, 2022, 11:07:43 pm »
Quote
By "debugging information" do you mean all variables are visible?

All variables that make it into the ELF file. And not only variables, but macros as well, and perhaps more. Here is the list of debugging options with explanation.
 

