Author Topic: function overhead (Read 851 times)

Kittu20 · « **on:** January 13, 2024, 03:20:49 pm »

I'm trying to understand the concept of inline functions versus regular functions and got stuck on the term 'function overhead.' In regular functions in C, I understand the stack memory allocation and release when a function is called and exits. For instance, we can have a function that takes an argument, increments a value, and doesn't return.

What exactly would the function overhead be in this case?

TheCalligrapher · « **Reply #1 on:** January 13, 2024, 05:09:08 pm »

In what "this case"? In case of inline functions?

Firstly, if a specific function call gets "inlined" (i.e. substituted into the calling code), there's no overhead. Or, one might say, there's no reason for any overhead. However, how well a compiler can substitute a function call is a Quality-of-Implementation issue. Some compilers might be better at it, some worse... Just remember that this kind of substitution is not a function property, it is a property of each specific call. The decision to substitute is made on a per-call basis. Some calls might get substituted, other calls (to the same function) might not. And when a specific call gets substituted, it basically "dissolves" into the surrounding (calling) code and gets optimized in the context of the calling code. So, how it will look in the end depends a lot on the context as well.

Secondly, remember that all the compiler needs in order to substitute a function call is to see the original function definition (the function body). If the compiler has access to the full function body, it can substitute calls to this function. It does not matter whether the function is declared `inline` or not. Most modern compilers simply ignore the keyword `inline` when making decisions about call substitution. Instead, they apply their own criteria when deciding whether to substitute a call. Any function whose body is visible to the compiler is a potential candidate for substitution.

For example, a function with internal linkage (a `static` function) is always defined in the same translation unit in which it is called (called directly). This immediately means that when it comes to call substitution, declaring a function `static` is effectively equivalent to declaring it `inline`.
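A minimal sketch of that point, not a guarantee about any particular compiler: `bump()` and `count_to()` are hypothetical names chosen for illustration. The `static` function has no `inline` keyword, yet its body is visible at the call site, so an optimising compiler is free to substitute the call.

```c
/* Hypothetical example: bump() and count_to() are illustrative names. */

/* Internal linkage: the body is visible wherever the function is called
   in this file, so the call is a candidate for substitution even though
   the `inline` keyword is absent. */
static int bump(int x)
{
    return x + 1;
}

/* With optimisation enabled, a compiler may substitute the call to bump()
   and then simplify the loop in the context of this function. Whether it
   does so is the compiler's own per-call decision. */
int count_to(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total = bump(total);
    }
    return total;
}
```

Whether the call really was substituted can only be confirmed by looking at the generated assembly for the specific compiler and options in use.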

shapirus · « **Reply #2 on:** January 13, 2024, 05:24:49 pm »

Quote from: TheCalligrapher on January 13, 2024, 05:09:08 pm

It does not matter whether the function is declared `inline` or not. Most modern compilers simply ignore the keyword `inline` when making decisions about call substitution.

...at least by default, or with the recommended optimisation arguments. There are ways to specify inlining or not inlining explicitly.
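As a hedged illustration of "specifying explicitly", here is a sketch that assumes GCC or Clang; other compilers spell these controls differently. The function names are hypothetical. The `always_inline` and `noinline` attributes are compiler extensions, not part of standard C.

```c
/* Assumption: built with GCC or Clang. Function names are hypothetical. */

/* Request substitution at every call site, regardless of the compiler's
   usual inlining heuristics. */
static inline __attribute__((always_inline)) int inc_forced(int i)
{
    return i + 1;
}

/* Request that calls to this function are never substituted, so a real
   call instruction is emitted. */
static __attribute__((noinline)) int inc_outlined(int i)
{
    return i + 1;
}

/* Both calls compute the same value; only the generated code differs. */
int use_both(int x)
{
    return inc_forced(x) + inc_outlined(x);
}
```

The plain `inline` keyword on its own carries no such force; these attributes are the explicit mechanism.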

brucehoult · « **Reply #3 on:** January 13, 2024, 10:00:12 pm »

Quote from: Kittu20 on January 13, 2024, 03:20:49 pm

I'm trying to understand the concept of inline functions versus regular functions and got stuck on the term 'function overhead.' In regular functions in C, I understand the stack memory allocation and release when a function is called and exits. For instance, we can have a function that takes an argument, increments a value, and doesn't return.

What do you mean "doesn't return"? It's not a function if it doesn't return.

If you have a C function such as ...

```c
int inc(int i) {
  return i + 1;
}
```

... then a decent modern ISA such as RISC-V or Arm (either 32 bit or 64 bit) will have two instructions in the function:

```asm
inc:
    add a0,a0,1
    ret
```

It takes one instruction to call the function, and one to return, so you could say the overhead is two instructions. It might take several clock cycles for each of those control flow changes on a simple CPU, or just one cycle on one with instruction cache and/or branch target cache. Mid-range CPUs and up (Arm A53, SiFive U74, Intel Pentium, PowerPC 601/603) will execute both the add and ret in the same clock cycle.

If the instruction set is such that the return address must be written to RAM and then read back on returning then that adds more overhead. So do instruction sets (e.g. VAX CALLS/CALLG) or ABIs that require setting up and tearing down a stack frame even for trivial functions.

There is also overhead in the calling function to consider. You may need to copy the function argument to the correct register (or worse, push it on to the stack) and copy the result somewhere else to preserve it before the next function call. Both of those are not necessary if the function is inlined.

Even worse, if this is the only function call made by the caller, then this may force the caller to set up a stack frame, and save and restore the return address and other registers.
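To make the caller-side cost concrete, here is a small sketch. `add_incremented()` is a hypothetical caller, and the comments describe what a typical register-based ABI would require if the call is not substituted; the exact instructions depend on the ISA, ABI, and compiler.

```c
/* The callee from earlier in this post. */
int inc(int i)
{
    return i + 1;
}

/* Hypothetical caller. If the call to inc() is NOT substituted, then on a
   typical register-based ABI the caller has to:
     - move `b` into the first argument register,
     - keep `a` somewhere that survives the call (a callee-saved register
       or a stack slot), which means saving and restoring that register,
     - on link-register ISAs such as RISC-V or Arm, save and restore its
       own return address, because the call overwrites it,
     - combine the value in the return register with the preserved `a`.
   If the call IS substituted, the body reduces to two additions. */
int add_incremented(int a, int b)
{
    return a + inc(b);
}
```

Because both functions sit in one file here, an optimising compiler will likely substitute the call. To observe the non-substituted form, `inc()` could be placed in a separate translation unit.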

golden_labels · « **Reply #4 on:** January 13, 2024, 10:57:51 pm »

It’s OK to be curious, but don’t get fixated on this topic. Outside small-microcontroller programming it’s going to be of almost no importance. This is not the 1980s anymore. What compilers generate, how processors work, and how the environment affects a program’s performance have changed greatly in the past four decades. Functions small enough to measurably benefit from either choice are also rare.

If your target is microcontrollers like AVR, PIC, or 8051, then skip my post entirely. Things are a bit different there, and I can’t give you any advice or explain things better than what you have read so far. Otherwise, if this is about general programming, continue.

You might’ve heard a story that goes like this. A “normal” function call “is slow”, because arguments are put on the stack, then the function is called, then they are taken from the stack, then a result is written, then the func… blebleh. Whereas an “inline” function skips all this and “is blazing fast”. Yes, this is how it was with the 386 and compilers that almost literally replaced C fragments with machine code. Such things haven’t existed for 10–20 years.⁽¹⁾

Nowadays compilers generate machine code independent of the original source code structure, weighing a lot of factors, backed by years of experience. Machines have pipelining, advanced prefetching, caching, and speculative execution, and can schedule and parallelize work to avoid idling. Execution environments add their own factors. Not only is inlining done by compilers as needed, only sometimes requiring hints from the programmer, but it can also have a detrimental effect on performance through how it interacts with caches, paging, and branch prediction. Just note how rarely functions are marked as inline in actual software to get the hint.⁽²⁾

Satisfying curiosity and extending general knowledge is great. But to properly understand the overhead associated with code that uses inlining or not, you will need to read the final machine code and understand how it interacts with the machine and the environment. In the 2020s there is no way around this. While very simplistic C code may produce something resembling the 1980s-style description, in general this is not true. For example, any sane compiler is going to optimize tail recursion into something that looks nothing like it. Non-inlined functions will make use of modern calling conventions, inlined functions will get interleaved with surrounding code, and a ton of unexpected operations may appear due to non-obvious constraints. And to understand the actual gain (or loss) from inlining, it’s crucial to see the call in the context of other instructions.
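A small sketch of the tail-recursion remark, using hypothetical function names. The recursive call is in tail position, so an optimising compiler may replace it with a jump and emit a plain loop; nothing in the C standard requires this, so treat it as something to verify rather than assume.

```c
/* Hypothetical example: sum of 1..n written with an accumulator so that
   the recursive call is the last action of the function. */
static unsigned long sum_acc(unsigned long n, unsigned long acc)
{
    if (n == 0) {
        return acc;
    }
    /* Tail call: nothing remains to be done after it returns, so the
       compiler may reuse the current frame and jump to the top instead
       of emitting a real call. */
    return sum_acc(n - 1, acc + n);
}

/* Public entry point that supplies the initial accumulator. */
unsigned long sum_to(unsigned long n)
{
    return sum_acc(n, 0);
}
```

Compiling with something like `gcc -O2 -S` and reading the resulting assembly is how to see what was actually emitted; without optimisation the same source usually produces a genuine recursive call per step.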

⁽¹⁾ “Trivial” C compilers do exist, but nobody uses them if performance is relevant.
⁽²⁾ And nowadays the `inline` keyword does not force inlining. Compiler-specific features must be used, like GCC’s `always_inline` attribute. Otherwise the compiler treats it as a suggestion.

Siwastaja · « **Reply #5 on:** January 14, 2024, 07:34:29 pm »

Quote from: DiTBho on January 14, 2024, 01:55:19 pm

in short, to minimize the number of times I would have to invoke the "read()" function I should make the buffer as large as possible, while to process a data block optimally I have to process it character by character, so I would be tempted to make it as small as possible.

How confusing. There is overhead with a read() call, possibly significant, but it has absolutely nothing to do with this topic, which is about function call overhead. Function call overhead of read() is maybe 0.000001% of the total overhead.

DiTBho · « **Reply #6 on:** January 14, 2024, 07:59:56 pm »

Quote from: Siwastaja on January 14, 2024, 07:34:29 pm

How confusing

You have no idea about the kernel pressure when you invoke functions like filesystem read().
The idea behind my post was to point out that overhead is always "how much extra stuff you need to do in order to get something", meaning the resources (CPU cycles, RAM, etc.) required to set up an operation.

edit:
Posts deleted.

TheCalligrapher · « **Reply #7 on:** January 14, 2024, 08:29:48 pm »

Quote from: shapirus on January 13, 2024, 05:24:49 pm

Quote from: TheCalligrapher on January 13, 2024, 05:09:08 pm

It does not matter whether the function is declared `inline` or not. Most modern compilers simply ignore the keyword `inline` when making decisions about call substitution.

...at least by default, or with the recommended optimisation arguments. There are ways to specify inlining or not inlining explicitly.

Yes, that's true. But even that is not necessarily focused on the `inline` keyword... The keyword is almost lost in the variety of other factors. For example, one of the most basic and obvious call substitution logics is: substitute calls to static functions called only once.

The primary purpose of `inline` keyword in modern C and C++ is to override ODR restrictions, not to hint at call substitution. I.e. `inline` exists primarily to allow one to create multiple definitions of entities with external linkage and not get barked at by the linker.

SiliconWizard · « **Reply #8 on:** January 14, 2024, 09:26:38 pm »

I suspect the question is some kind of homework, at least it looks like it. Very generic question which is ill-constructed and probably expects a very generic and basic answer - typical introductory academic material that will drive any engineer nuts.

First, I guess that by "function overhead", what is really meant is "function call overhead". Otherwise that doesn't mean much. A first-level answer to that is rather simple, it's probably just meant to teach how a function call works in a typical basic programming model (parameter passing, branch, context saving if required, some processing, context restoring if required, then branch to return.)

Of course that's just basic, introductory material. The actual implementation of a function call in a given language, in a given code context, for a given target, using a given compiler with given optimizations can vary wildly and may have nothing to do with the above.

That reminds me of many basic college course questions. You strongly suspect that the expected answer is pretty basic, but you know that it's a wrong answer to a wrong question.

Siwastaja · « **Reply #9 on:** January 15, 2024, 10:40:07 am »

Quote from: SiliconWizard on January 14, 2024, 09:26:38 pm

I suspect the question is some kind of homework,

Just an AI filling in the gaps, like all these weirdly generic-but-specific questions posted under Indian flag in perfect English starting about a year ago.

magic · « **Reply #10 on:** January 15, 2024, 05:26:25 pm »

Weird homework questions from certain flags are a phenomenon which predates high quality AI generators.
I.e., it seems to be actual humans posting them. If they did it in the past, they may still be doing it today.

