Author Topic: will using functions add to memory ? (Read 11024 times)

Simon · « **on:** May 18, 2014, 03:22:54 pm »

Will using functions for portions of my code add to the program memory usage (if each is called only once)? My understanding of functions is that the contents of the function just replace the function call. So, assuming I use a function only once, will calling the function use more program memory than putting the function's code straight into the program?

madires · « **Reply #1 on:** May 18, 2014, 03:54:10 pm »

Quote from: Simon on May 18, 2014, 03:22:54 pm

Will using functions for portions of my code add to the program memory usage (if each is called only once)? My understanding of functions is that the contents of the function just replace the function call. So, assuming I use a function only once, will calling the function use more program memory than putting the function's code straight into the program?

Yes, it will increase the stack usage. When the function is called it will save some registers to the stack (push), i.e. the registers which are used locally in the function. Before returning, the function restores the used registers by retrieving them from the stack (pop). Also the program counter has to be tracked for each function call (stored in the stack again) and function parameters might also be pushed to the stack.

Simon · « **Reply #2 on:** May 18, 2014, 04:00:49 pm »

By stack do you mean RAM, or the general purpose registers? And is it much to be concerned about?

mariush · « **Reply #3 on:** May 18, 2014, 04:03:31 pm »

I would say it depends on the compiler, how well it optimizes the source code.

I currently have a project open that uses XC8 1.31 (free). It has a function that's called only once, from main(), and looking at the disassembly, the compiler simply does a call to the address where the function is instead of placing its assembly inside main()'s code. So that's about 4 instruction cycles lost for the call and return (at a 16 MHz clock, about 1 µs of wasted time) and about 3 bytes of extra flash memory usage.

Maybe XC8 pro optimizes it, I don't know.
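A minimal sketch of the two variants being compared here, assuming a made-up helper (`scale_reading`, `with_call` and `by_hand` are hypothetical names, not from the project above). Whether the call survives in the generated code depends on the compiler and its optimization level, so the disassembly or map file is the only reliable way to tell.

```c
#include <stdint.h>

/* Hypothetical helper that is called from exactly one place. */
static uint16_t scale_reading(uint16_t raw)
{
    return (uint16_t)((raw * 5u) / 4u);
}

/* Variant 1: keep the helper as a separate function.
   A non-optimizing build typically emits a real call and return here. */
uint16_t with_call(uint16_t raw)
{
    return scale_reading(raw);
}

/* Variant 2: the same arithmetic written straight into the caller,
   which is what inlining would produce. */
uint16_t by_hand(uint16_t raw)
{
    return (uint16_t)((raw * 5u) / 4u);
}
```

Both variants compute the same value; the only difference that can show up is the call/return overhead in flash and cycles.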

madires · « **Reply #4 on:** May 18, 2014, 04:14:20 pm »

Quote from: Simon on May 18, 2014, 04:00:49 pm

By stack do you mean RAM, or the general purpose registers? And is it much to be concerned about?

Yes, RAM. The stack is located in the RAM. If you exceed the maximum stack size you'll probably overwrite variables stored in RAM but a good compiler should warn you if it knows about the memory constraints. https://mbed.org/handbook/Memory-Model has some nice pictures explaining how the memory space is divided.

John Coloccia · « **Reply #5 on:** May 18, 2014, 04:19:01 pm »

Quote from: Simon on May 18, 2014, 04:00:49 pm

By stack do you mean RAM, or the general purpose registers? And is it much to be concerned about?

When you call a function, it needs to store where the call came from (so it can go back when it's done), parameters and local variables that the function uses (must be stored each time or the function could never be re-entrant) and whatever value is returned. This all traditionally gets stored in the "stack", which is generally just a portion of RAM that's been allocated for stack space. Of course, the compiler is always free to do what it wants, but that's generally the case.

Even if you run out of stack space, it's generally just a matter of increasing it in the compiler options. Unless you have a reason to worry about it, I wouldn't worry about it.
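A small sketch of the re-entrancy point, assuming a conventional stack-based compiler (`nested`, `local_addr` and `MAX_DEPTH` are hypothetical names chosen for illustration). Each active call gets its own copy of the local variable, so the recorded addresses differ from one depth to the next. Some small-MCU compilers place locals statically instead, in which case recursion like this may not be supported at all.

```c
#include <stdint.h>

#define MAX_DEPTH 4

/* Address of the local variable seen at each call depth, stored as an
   integer so it can still be inspected after the calls have returned. */
static uintptr_t local_addr[MAX_DEPTH];

/* Recurses MAX_DEPTH levels deep and returns the sum of the depths. */
int nested(int depth)
{
    volatile int local = depth;            /* volatile keeps it in memory */
    local_addr[depth] = (uintptr_t)&local; /* record where this copy lives */
    if (depth + 1 < MAX_DEPTH)
        return nested(depth + 1) + local;
    return local;
}
```

Calling `nested(0)` leaves four different addresses in `local_addr`, one per nesting level, which is the RAM cost that grows with call depth.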

Simon · « **Reply #6 on:** May 18, 2014, 04:31:36 pm »

OK, that doesn't sound like too much of a problem. I'm at 3% of RAM and I can certainly lose 3 bytes.

zapta · « **Reply #7 on:** May 18, 2014, 04:46:46 pm »

Look up the 'inline' keyword. If a function or method is called only once, it should not increase the RAM or flash size.


tjaeger · « **Reply #9 on:** May 18, 2014, 07:07:47 pm »

Usually it's enough to declare the function as static, this way the compiler knows the function is only used once and will inline it automatically.

Simon · « **Reply #10 on:** May 18, 2014, 07:16:52 pm »

How do I do that?

bingo600 · « **Reply #11 on:** May 18, 2014, 07:56:42 pm »

Put static in front of it, and remember to predeclare it at the top.
And maybe even tell avr-gcc to inline the function.

static inline uint8_t myfunc(void); //predeclare

static inline uint8_t myfunc(void)
{
..
..
..
}

Note that static means that it can only be seen/called within the ".c file" it is declared in, as it will NOT generate a "global" symbol for the linker.

If you call the function several times, the compiler will try to inline it every time, meaning larger code.
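A compilable version of the pattern above, as a sketch only. The body of `myfunc` is a placeholder (the post leaves it out), and `read_status` is a hypothetical caller added so the example has exactly one call site.

```c
#include <stdint.h>

static inline uint8_t myfunc(void);   /* predeclare */

/* Hypothetical caller: the only place myfunc() is used. */
uint8_t read_status(void)
{
    return myfunc();
}

/* Placeholder body; substitute the real work here. */
static inline uint8_t myfunc(void)
{
    return 0x2Au;
}
```

Because `myfunc` is static, no other source file can call it, so the compiler can see every call site and is free to drop the standalone copy if it inlines the single use.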

/Bingo

tjaeger · « **Reply #12 on:** May 18, 2014, 08:26:11 pm »

Quote from: bingo600 on May 18, 2014, 07:56:42 pm

Put static in front of it, and remember to predeclare it at the top.
And maybe even tell avr-gcc to inline the function.

static inline uint8_t myfunc(void); //predeclare

It's generally better to leave the decision on whether or not to inline the function up to the compiler. Unless maybe the function is used in a really tight loop or an ISR.
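For the cases where the decision does need to be forced, GCC and Clang offer function attributes as a compiler extension. This is a sketch assuming a GCC-compatible toolchain such as avr-gcc; other compilers (XC8, for instance) may not accept this syntax, and the function names are made up for illustration.

```c
#include <stdint.h>

/* GCC/Clang extension: inline this even where the optimizer would not. */
static inline __attribute__((always_inline)) uint8_t low_nibble(uint8_t v)
{
    return (uint8_t)(v & 0x0Fu);
}

/* GCC/Clang extension: never inline this, always keep a real call. */
static __attribute__((noinline)) uint8_t high_nibble(uint8_t v)
{
    return (uint8_t)(v >> 4);
}

/* Hypothetical caller that uses one helper of each kind. */
uint8_t swap_nibbles(uint8_t v)
{
    return (uint8_t)((low_nibble(v) << 4) | high_nibble(v));
}
```

Plain `inline` remains only a hint; these attributes are the heavier tool and are best kept for the tight-loop or ISR cases mentioned above.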

bingo600 · « **Reply #13 on:** May 18, 2014, 09:08:17 pm »

Quote from: tjaeger on May 18, 2014, 08:26:11 pm

Quote from: bingo600 on May 18, 2014, 07:56:42 pm
Put static in front of it, and remember to predeclare it at the top.
And maybe even tell avr-gcc to inline the function.

static inline uint8_t myfunc(void); //predeclare

It's generally better to leave the decision on whether or not to inline the function up to the compiler. Unless maybe the function is used in a really tight loop or an ISR.

Who would inline a function if it wasn't tight code?

/Bingo

John Coloccia · « **Reply #14 on:** May 18, 2014, 09:10:36 pm »

Quote from: tjaeger on May 18, 2014, 08:26:11 pm

Quote from: bingo600 on May 18, 2014, 07:56:42 pm
Put static in front of it, and remember to predeclare it at the top.
And maybe even tell avr-gcc to inline the function.

static inline uint8_t myfunc(void); //predeclare

It's generally better to leave the decision on whether or not to inline the function up to the compiler. Unless maybe the function is used in a really tight loop or an ISR.

Bingo. Compilers generally do a much better job of optimizing code than programmers do. The exception to this is DSPs with strange instruction sets, cache management, memory alignment requirements, etc. Then you really have to know exactly what you're doing or you will botch things up royally. Also, it pays to know how your specific processor's memory management and cache management work if you want to get every drop of performance out of a piece of code. I've done a lot of real time coding, and processing is RARELY the problem. Nearly every problem ends up being I/O bound and you're always coming up against issues such as flushing the pipeline, caching the wrong data, etc etc etc. That's where the money is. The compiler does a fine job on its own of figuring out how to handle mundane tasks such as function calls, loop unrolling, inlining and things like that.

Let me give you an example. Which of these will run faster, assuming no additional optimization:

for (i = 0; i < SOME_NUMBER; i++)
{
    DoSomething(i);
}

or

{
    i = 0;
    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;

    UnRolled;
    DoSomething;
    i++;
}

Most people still say the second way is obviously faster, but most of the time the first way actually has the advantage because it's quite likely that DoSomething() will be sitting in the instruction cache after the first call and the rest will execute blazingly fast, whereas the second way forces a bunch more memory to be read in...which is SLOOOOOOWWWWWWW. Also, churning on "localized" memory and spitting the results out at the end, as opposed to working on scattered memory...same thing. Intelligent memory management is where you get truly huge gains, and compilers are very poor at this because while they can optimize instructions very well, they're very poor at figuring out your semantics. How does it know what's OK and what's not? It generally doesn't and can't take full advantage of the architecture without intelligent layout from the designer to show the way.
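To make the "localized versus scattered memory" point concrete, here is a sketch with two hypothetical functions that add up the same buffer in different orders. The buffer is assumed to hold ROWS x COLS bytes in row-major order. On a cached processor the first order usually touches memory more kindly; on a small microcontroller without a cache there may be no measurable difference.

```c
#include <stddef.h>
#include <stdint.h>

#define ROWS 64
#define COLS 64

/* Walks the buffer in the order it is laid out in memory. */
uint32_t sum_rows_first(const uint8_t *m)
{
    uint32_t total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r * COLS + c];
    return total;
}

/* Same result, but each step jumps COLS bytes ahead in memory. */
uint32_t sum_cols_first(const uint8_t *m)
{
    uint32_t total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += m[r * COLS + c];
    return total;
}
```

The compiler sees two equivalent loops and has no way to know which access pattern suits the data layout; that choice is up to the designer.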

Marco · « **Reply #15 on:** May 18, 2014, 09:44:11 pm »

If all was right in the world we would never even have to unroll code. But no ... x86 and ARM processor designers would rather blow vast amounts of hardware on branch prediction which is still guaranteed to get it wrong at least once in the loop rather than introduce/support instructions for zero overhead looping.

hamster_nz · « **Reply #16 on:** May 18, 2014, 10:19:39 pm »

Quote from: Marco on May 18, 2014, 09:44:11 pm

If all was right in the world we would never even have to unroll code. But no ... x86 and ARM processor designers would rather blow vast amounts of hardware on branch prediction which is still guaranteed to get it wrong at least once in the loop rather than introduce/support instructions for zero overhead looping.

You must be a software person... :-)

Pipelining is essential for the CPU to have the data it wants, where it wants it, when it wants it. Fetching a register takes a cycle (0.3 ns), adding two numbers takes a cycle (0.3 ns), and storing a value into a register takes a cycle (0.3 ns), all of which is hidden from the programmer.

I'm happy with my 3 GHz CPU, even if a 64-bit IDIV takes 96 cycles.

PeterG · « **Reply #17 on:** May 18, 2014, 11:30:41 pm »

Simon, you may find the following 2 videos helpful regarding 'Stacks'.

Harvs · « **Reply #18 on:** May 19, 2014, 12:01:36 am »

Whilst it's good to be thinking about, until you actually hit issues of running out of resources or performance not meeting requirements, making clean, easily debugged and reusable code is more important.

In other words, if your code is going to be more readable by extracting the function, do it.

Bassman59 · « **Reply #19 on:** May 19, 2014, 01:36:00 am »

Quote from: tjaeger on May 18, 2014, 07:07:47 pm

Usually it's enough to declare the function as static, this way the compiler knows the function is only used once and will inline it automatically.

Declaring a function as static doesn't tell the compiler that it's used only once and therefore can be inlined.

A static function has internal, that is, file-scope, linkage; as such it is not visible outside of the file in which it is declared. A static function foo() declared in one source file will be considered by the linker a totally different animal from a static function foo() declared in another source file.

If you want the compiler to consider inlining a function, use the inline keyword.

Marco · « **Reply #20 on:** May 19, 2014, 01:46:40 am »

Quote from: hamster_nz on May 18, 2014, 10:19:39 pm

Pipelining is essential for the CPU to have the data it wants, where it wants it, when it wants it. Fetching a register takes a cycle (0.3 ns), adding two numbers takes a cycle (0.3 ns), and storing a value into a register takes a cycle (0.3 ns), all of which is hidden from the programmer.

A branch mispredict takes 6, that's why we sometimes still have to unroll. There is no inherent conflict between pipelining and zero overhead looping (or code loadable BTBs like Cell has for that matter).

I didn't mean to present it as an either/or situation ... I meant that they rather throw more hardware at branch prediction for diminishing returns than create instructions and spend the tiny amount of hardware necessary for these kinds of corner cases which branch prediction can fundamentally not get right (until it becomes data dependent, but that will take a whole lot more hardware).

John Coloccia · « **Reply #21 on:** May 19, 2014, 02:10:18 am »

Marco, if they built x86 chips like DSPs, what would happen is caching and memory management would become very predictable, and only .01% of programmers would have a clue what the hell is going on and you'd have code that would grind to a halt on the fastest of processors...

...and everyone would scratch their head trying to figure out why the latest Intel Ultium chip (or whatever the hell they're calling it these days) doesn't work worth a damn.

There's a reason that competent engineers get paid well to do embedded and real-time design. It's not easy, and it's not the same skill set as cobbling together an accounting package.

Anyhow, any compiler worth a damn will unroll ENOUGH to satisfy getting the code optimally stuffed into the cache. It's not always a matter of unroll or don't unroll. Sometimes, you unroll a little bit and loop over that. Processor optimization is a very tricky business. I've done plenty of it by hand, plenty of it just letting the compiler go on its own, and the best is always a mix of optimizing the I/O based on architecture and allowing the compiler to optimize the actual instructions.
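"Unroll a little bit and loop over that" looks roughly like the sketch below. The function names are hypothetical and the unroll factor of 4 is picked arbitrarily for illustration; an optimizing compiler would normally choose the factor itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop: one add and one loop test per element. */
uint32_t sum_simple(const uint8_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Partially unrolled: four adds per loop test, then a short
   clean-up loop for the 0 to 3 elements left over. */
uint32_t sum_unrolled4(const uint8_t *data, size_t n)
{
    uint32_t total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; i++)
        total += data[i];
    return total;
}
```

The loop body stays small enough to sit in the instruction cache while the branch overhead is spread over four elements, which is the middle ground between the two extremes discussed earlier in the thread.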

IanB · « **Reply #22 on:** May 19, 2014, 02:35:23 am »

Quote from: Simon on May 18, 2014, 03:22:54 pm

Will using functions for portions of my code add to the program memory usage (if each is called only once)? My understanding of functions is that the contents of the function just replace the function call. So, assuming I use a function only once, will calling the function use more program memory than putting the function's code straight into the program?

A good principle to live by is that computer programs should be written for the benefit of human readers, so that programmers can readily understand what they are supposed to do and can easily modify them.

Subroutines usually make the code easier to understand, therefore subroutines are usually a good idea.

The time to worry about the potential overhead of subroutines in memory or performance is when (if ever) memory or performance become limiting. If you don't hit the limits it is better to write more maintainable code. And if you do hit the limits it may be better to upgrade the hardware than to spend time messing with the code to fit a quart into a pint pot.

Marco · « **Reply #23 on:** May 19, 2014, 03:28:20 am »

Quote from: John Coloccia on May 19, 2014, 02:10:18 am

only .01% of programmers would have a clue what the hell is going on and you'd have code that would grind to a halt on the fastest of processors.

It's not an either/or situation, take Cell's solution to just let you load an address in the BTB ... the ability to do that doesn't remove the ability of the branch predictor to do it when you don't.

John Coloccia · « **Reply #24 on:** May 19, 2014, 03:39:49 am »

Quote from: Marco on May 19, 2014, 03:28:20 am

Quote from: John Coloccia on May 19, 2014, 02:10:18 am
only .01% of programmers would have a clue what the hell is going on and you'd have code that would grind to a halt on the fastest of processors.
It's not an either/or situation, take Cell's solution to just let you load an address in the BTB ... the ability to do that doesn't remove the ability of the branch predictor to do it when you don't.

Yeah, that's a valid point. A lot of mainstream processors really could do a much better job of giving aware programmers, and maybe aware compilers, a fighting chance of getting some sort of higher performance out of them.

Or at least predictable performance.

And it always always always (nearly always) comes back to nearly everything in the high performance world being I/O bound.
The more control we have over memory and cache management, the easier it is for the guy who's really trying to squeeze performance out of a system.

