Topic: Size of STM32Cube generated code

Offline jyrgen (Topic starter)

  • Contributor
  • Posts: 14
  • Country: se
Size of STM32Cube generated code
« on: January 11, 2018, 01:04:26 pm »
Hi, I have previously made a PCB with an 8-bit AVR using some basic functionality (GPIO, ADC, I2C, SPI, Timer, USART). The final compiled code for the entire application is ~8 KB (text). I now want to do a new version of the board with the same functionality and have changed the MCU to an STM32L0. Since this is my first project I'm using STM32Cube to set up the pin functions and generate that code. When I compile this generated code using System Workbench and optimize for size, the size is ~10 KB (text). I'm a beginner on this kind of processor, but is it normal that code that just sets up the ADC, I2C, SPI, DAC, Timer and USART and does not do anything else is that large?
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4078
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: Size of STM32Cube generated code
« Reply #1 on: January 11, 2018, 01:33:29 pm »
Yes. Vendor library code is always so bloated (size penalty) and abstracted (speed penalty) that it isn't compact anymore.
It was never intended to be compact: it's a library with no guarantees whatsoever, made as an application example or for quick prototyping, to get you going on today's vastly complicated microcontrollers.

Code will be quicker and more compact if you cherry-pick the useful bits from the example code.
 
The following users thanked this post: jyrgen

Offline dgtl

  • Regular Contributor
  • *
  • Posts: 183
  • Country: ee
Re: Size of STM32Cube generated code
« Reply #2 on: January 11, 2018, 02:12:13 pm »
Yes, ST HAL is large.
For smaller and faster code, write directly to registers as you probably did for AVR. Or just accept the waste of flash space and move on, if the rest of your functionality fits in the flash (the onboard flash is usually much larger than on AVRs). Also, ARM Thumb2 binary code is not as compact as AVR code for a lot of use cases, so don't expect too much. In addition, some flash is wasted on larger IRQ tables etc.

If you go the without-HAL route, the parts you want to take from the generated project are:
* CMSIS. A couple of headers with your Cortex-M CPU-specific macros, NVIC IRQ controller definitions etc.
* startup. Either a C or an ASM file with the interrupt vector table, with weak functions mapped to a default handler; the memory clear and flash-to-RAM variable copy loop.
* linker script. The LD file, which sets up the flash and RAM memory layout.
* peripheral register header.
Having these, you can compile a simple program that runs your code in main().
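
The "memory clear and flash-to-RAM variable copy loop" from the startup file can be sketched in plain C as below. This is only a host-runnable illustration, not ST's startup code: the name `init_memory` and its parameters are made up here, and in a real startup file the five addresses come from symbols defined in the linker script, whose names differ between toolchains.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper showing what a reset handler does before main().
 * On a real target the five pointers are symbols exported by the linker
 * script; here they are parameters so the loops can run on any machine. */
void init_memory(const uint32_t *data_load,  /* initial values stored in flash */
                 uint32_t *data_start,       /* start of .data in RAM */
                 uint32_t *data_end,         /* one past the end of .data */
                 uint32_t *bss_start,        /* start of .bss in RAM */
                 uint32_t *bss_end)          /* one past the end of .bss */
{
    assert(data_start <= data_end && bss_start <= bss_end);

    /* Copy initialised variables from flash to RAM. */
    for (uint32_t *dst = data_start; dst < data_end; ++dst) {
        *dst = *data_load++;
    }

    /* Clear zero-initialised variables. */
    for (uint32_t *dst = bss_start; dst < bss_end; ++dst) {
        *dst = 0u;
    }
}
```

After these two loops the reset handler normally just calls main(); everything else in the startup file is the vector table.
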
The next step would be setting up the clocks. The simplest way would be to copy the required parts of the code from ST HAL. On AVR the clocks are configured with fuse bits; on ARM processors, the CPU typically wakes up on an internal oscillator and your code can configure the clocks as you want at runtime. A typical sequence is as follows: start the external high-speed oscillator (HSE), wait until it starts up, configure the PLL, set the PLL input to HSE, wait for the PLL to lock, configure the bus clock dividers (to avoid overclocking the buses on the clock switch), set up the correct flash latency, switch to the PLL clock.
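
The clock sequence above could look roughly like this in C. It is a sketch under assumptions, written against mock register structs so it can run on a PC: `rcc_regs_t`, `flash_regs_t`, `wait_for` and `clock_switch_to_pll` are hypothetical names, and the bit masks are illustrative placeholders that must be checked against the reference manual of the actual part. On a target you would use the RCC and FLASH definitions from the device header instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock register blocks (hypothetical, for illustration only). */
typedef struct {
    volatile uint32_t CR;    /* oscillator and PLL on/ready bits */
    volatile uint32_t CFGR;  /* PLL setup, bus dividers, clock switch */
} rcc_regs_t;

typedef struct {
    volatile uint32_t ACR;   /* flash access control (wait states) */
} flash_regs_t;

/* Illustrative masks: verify every one in the reference manual. */
#define CR_HSEON         (1u << 16)
#define CR_HSERDY        (1u << 17)
#define CR_PLLON         (1u << 24)
#define CR_PLLRDY        (1u << 25)
#define CFGR_SW_MASK     (3u << 0)
#define CFGR_SW_PLL      (3u << 0)
#define CFGR_SWS_MASK    (3u << 2)
#define CFGR_SWS_PLL     (3u << 2)
#define CFGR_PLLSRC_HSE  (1u << 16)
#define ACR_LATENCY_1WS  (1u << 0)

/* Poll until (reg & mask) == want, giving up after max_polls reads. */
static bool wait_for(const volatile uint32_t *reg, uint32_t mask,
                     uint32_t want, uint32_t max_polls)
{
    while (max_polls--) {
        if ((*reg & mask) == want) {
            return true;
        }
    }
    return false;
}

/* pll_bits and divider_bits are CFGR field values chosen by the caller. */
bool clock_switch_to_pll(rcc_regs_t *rcc, flash_regs_t *flash,
                         uint32_t pll_bits, uint32_t divider_bits,
                         uint32_t max_polls)
{
    assert(rcc != NULL && flash != NULL);
    /* Caller-supplied bits must not touch the clock switch fields. */
    assert(((pll_bits | divider_bits) & (CFGR_SW_MASK | CFGR_SWS_MASK)) == 0u);

    /* 1. Start the external oscillator and wait until it is stable. */
    rcc->CR |= CR_HSEON;
    if (!wait_for(&rcc->CR, CR_HSERDY, CR_HSERDY, max_polls)) {
        return false;
    }

    /* 2. Configure the PLL and select HSE as its input. */
    rcc->CFGR |= pll_bits | CFGR_PLLSRC_HSE;

    /* 3. Start the PLL and wait for lock. */
    rcc->CR |= CR_PLLON;
    if (!wait_for(&rcc->CR, CR_PLLRDY, CR_PLLRDY, max_polls)) {
        return false;
    }

    /* 4. Bus dividers first, so no bus is overclocked after the switch. */
    rcc->CFGR |= divider_bits;

    /* 5. Raise the flash wait states before the core clock goes up. */
    flash->ACR |= ACR_LATENCY_1WS;

    /* 6. Switch the system clock to the PLL and confirm the switch. */
    rcc->CFGR = (rcc->CFGR & ~CFGR_SW_MASK) | CFGR_SW_PLL;
    return wait_for(&rcc->CFGR, CFGR_SWS_MASK, CFGR_SWS_PLL, max_polls);
}
```

The bounded polling is a design choice of this sketch: a dead crystal then produces a false return value instead of a hang, so the caller can stay on the internal oscillator.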

Also, check the end of your MCU's reference manual. Some ST controller reference manuals have useful code examples there.
 
The following users thanked this post: jyrgen

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Re: Size of STM32Cube generated code
« Reply #3 on: January 11, 2018, 02:27:54 pm »
Quote from: jyrgen on January 11, 2018, 01:04:26 pm
"Since this is my first project I'm using STM32Cube to set up the pin functions and generate that code. When I compile this generated code using System Workbench and optimize for size, the size is ~10 KB (text). I'm a beginner on this kind of processor, but is it normal that code that just sets up the ADC, I2C, SPI, DAC, Timer and USART and does not do anything else is that large?"

STM32Cube is a useful tool to quickly get code running, for a price obviously. Once you are set with pins and peripherals, you can use the knowledge you have gained to (re)write clean and compact initialization code based on http://www.st.com/en/embedded-software/stm32snippetsl0.html. There are also STM32F0 snippets. For big projects with plenty of performance/flash/RAM it depends: you can stick with STM32Cube as well.
 
The following users thanked this post: jyrgen

Offline Kjelt

  • Super Contributor
  • ***
  • Posts: 6460
  • Country: nl
Re: Size of STM32Cube generated code
« Reply #4 on: January 11, 2018, 02:49:34 pm »
Besides the points already made, I would like to add that setting up a 32-bit micro such as the STM32, with more and larger peripherals, clock routing and DMA stuff, will result in three to four times larger code than on an 8-bit STM8, even if done by the same SW engineer (so without the automated Cube/HAL bloat).
There is just way more to configure unless you skip configuring all unused peripherals and only use a few.
 
The following users thanked this post: jyrgen

Offline lucazader

  • Regular Contributor
  • *
  • Posts: 221
  • Country: au
Re: Size of STM32Cube generated code
« Reply #5 on: January 18, 2018, 05:31:40 am »
A few extra things to consider:

There are quite a few things that the Cube generated projects enable by default, that you may not use in your project.
A good example of this is that the HAL will sometimes enable and use IRQ handlers that you may not explicitly use in your project. Quite often these can be removed with no impact on functionality, but with a noticeable reduction in code size.

By default your code will likely be set up to compile with little optimization turned on, e.g. -Og.
If you instead use -Os (optimize for size) and especially LTO (link time optimization) you can greatly reduce code size.
In a current project of mine using an STM32F3 the debug build has a compiled binary size of ~38KB. The release build (using -Os and LTO) is only ~26KB.

Another option for certain peripherals is to use the LL libraries that ST has recently put out. These are lower-level libraries that should result in smaller code size. They will offer less customization etc., but this may be a good trade-off between development ease and code size for your application.
 

Offline pkplex

  • Contributor
  • Posts: 22
Re: Size of STM32Cube generated code
« Reply #6 on: January 27, 2018, 09:48:45 am »
I recently threw a hissy fit and ditched both the normal HAL and System Workbench. Switched to the LL drivers, Makefile, and Sublime Text. Had to spend a while altering the Makefile to make it work with C++ and fix a few problems in it, add a flash target, etc. But I found switching from HAL to LL cut the code size to about a third. I think a very basic project with no additional code was ~6K using HAL and ~2K using LL.

 

Offline Warhawk

  • Frequent Contributor
  • **
  • Posts: 821
  • Country: 00
    • Personal resume
Re: Size of STM32Cube generated code
« Reply #7 on: January 27, 2018, 11:30:23 am »
I've started with STM32F0 recently, tried HAL, SPL, LL and ended up with CMISS. I really want to know what my MCU does. It also turned out that many things in libraries differ from version to version. If you want a professional and code-efficient code, stick with register access (CMISS). Code snippets from ST are great.

It may simply happen that bloatware libraries reduce execution speed down to that of properly written code on an 8-bit MCU.

Just my two cents.  ;)


Offline Kjelt

  • Super Contributor
  • ***
  • Posts: 6460
  • Country: nl
Re: Size of STM32Cube generated code
« Reply #8 on: January 27, 2018, 03:11:40 pm »
Quote from: Warhawk on January 27, 2018, 11:30:23 am
"I've started with STM32F0 recently, tried HAL, SPL, LL and ended up with CMISS. I really want to know what my MCU does. It also turned out that many things in libraries differ from version to version. If you want a professional and code-efficient code, stick with register access (CMISS). Code snippets from ST are great.
It may simply happen that bloatware libraries reduce execution speed down to that of properly written code on an 8-bit MCU.
Just my two cents.  ;)"

What is CMISS? Do you mean CMSIS (Cortex Microcontroller Software Interface Standard)?
AFAIK that is good for the Arm Cortex core and RTOS interfacing but not for the HAL interfacing, although some extra ports are available, but not generic for all ST processors AFAIK?
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8169
  • Country: fi
Re: Size of STM32Cube generated code
« Reply #9 on: January 27, 2018, 03:21:34 pm »
Write it like you do with AVR. All the registers are documented in the Reference Manual. The code size will be small, and code will be more readable. The STM32 peripheral library is just pure horror, not only for bloated size, but it's buggy, hard to use and unreadable as well.

Some extra brainwork is needed, for example, you need to remember to enable GPIO clocks from the RCC peripheral, but catches like this exist in the AVR as well. Some peripherals are more capable and hence, slightly more complex to initialize compared to an AVR, but nothing too shocking going on typically.
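
To make the GPIO clock catch concrete, here is a minimal sketch that runs on a PC against mock registers. The struct and function names (`rcc_mock_t`, `gpio_mock_t`, `gpio_clock_enable`, `gpio_set_mode`) are hypothetical, and both the name of the register holding the GPIO clock enables and the port-to-bit mapping differ between STM32 families, so treat `IOPENR` and "bit 0 = port A" as assumptions to verify in the Reference Manual.

```c
#include <assert.h>
#include <stdint.h>

/* Mock register blocks so the sketch runs anywhere; on a target, use the
 * RCC and GPIOx definitions from the device header instead. */
typedef struct { volatile uint32_t IOPENR; } rcc_mock_t;   /* GPIO port clock enables */
typedef struct { volatile uint32_t MODER;  } gpio_mock_t;  /* two mode bits per pin */

enum pin_mode {
    PIN_MODE_INPUT  = 0,
    PIN_MODE_OUTPUT = 1,
    PIN_MODE_ALT    = 2,
    PIN_MODE_ANALOG = 3
};

/* Step 1: without this, writes to the port's registers have no effect.
 * Assumption: port_index 0 means port A, 1 means port B, and so on. */
void gpio_clock_enable(rcc_mock_t *rcc, unsigned port_index)
{
    assert(port_index < 32u);
    rcc->IOPENR |= (1u << port_index);
}

/* Step 2: replace the two mode bits of one pin, leaving the others alone. */
void gpio_set_mode(gpio_mock_t *gpio, unsigned pin, enum pin_mode mode)
{
    assert(pin < 16u);
    uint32_t shift = pin * 2u;
    gpio->MODER = (gpio->MODER & ~(3u << shift)) | ((uint32_t)mode << shift);
}
```

Compared to AVR, the only new idea is the first call; the second is the same read-modify-write you would do on a DDR register, just with two bits per pin.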

For autogenerated stuff, why not, as long as it does the job easily and in no time; but since you are asking about it, the chances are: just ditch autogenerated shit and get some work done.
« Last Edit: January 27, 2018, 03:24:12 pm by Siwastaja »
 

Offline Warhawk

  • Frequent Contributor
  • **
  • Posts: 821
  • Country: 00
    • Personal resume
Re: Size of STM32Cube generated code
« Reply #10 on: January 27, 2018, 05:00:53 pm »
Quote from: Kjelt on January 27, 2018, 03:11:40 pm
"What is CMISS? Do you mean CMSIS (Cortex Microcontroller Software Interface Standard)?
AFAIK that is good for the Arm Cortex core and RTOS interfacing but not for the HAL interfacing, although some extra ports are available, but not generic for all ST processors AFAIK?"

Correct, I meant CMSIS  :)

Offline julianhigginson

  • Frequent Contributor
  • **
  • Posts: 783
  • Country: au
Re: Size of STM32Cube generated code
« Reply #11 on: February 03, 2018, 04:56:06 am »
Stm32 HAL is pretty heavy going, but the whole point is, you can use it a lot of ways (learning ADC HAL recently was HELL) and, well, it does mostly work... Then in future you can take a project targeted to one STM32 and map it to a new one... And it should work.. even across series.

The biggest problem with stm32cube and HAL and LL libraries, is that they give you all this power to define and setup things in so many ways, but cube in particular gives you no pointers as to what things make sense, and what things plain won't work..  but really that's a common issue with all their documentation, too.. it drowns you in information, but gives no pointers at all on how to combine the information to do anything useful.

The best they do is give you some example projects, but those are trivial to the point of being useless most of the time..
 

Offline andyturk

  • Frequent Contributor
  • **
  • Posts: 895
  • Country: us
Re: Size of STM32Cube generated code
« Reply #12 on: February 06, 2018, 12:45:18 am »
Quote from: julianhigginson on February 03, 2018, 04:56:06 am
"The biggest problem with stm32cube and HAL and LL libraries, is that they give you all this power to define and setup things in so many ways, [...]"
Agreed. It's a problem that's not unique to STM32.

Most OEM libraries are designed to include APIs for all the functionality offered by the hardware. That's usually a lot of stuff to jam into one API (on a modern MCU, anyway), and the biggest reason (IMO) why OEM libraries are unwieldy. In the case of the STM32 ADC, the HAL layer for that is super complex because the ADC is super complex.

I've had good luck just going right to the registers with the ADC because I only need a few specific things done.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8169
  • Country: fi
Re: Size of STM32Cube generated code
« Reply #13 on: February 07, 2018, 09:02:37 am »
The actual problem with STM32 libraries is that they do not provide any encapsulation or abstraction whatsoever - so you just need to somehow "know" each and every bit and piece you need to do to initialize a device. And in addition, you need to know how to handle the library overhead, which is typically around 1000%.

So what you end up with is either of these two:
1) you copypaste others' code
2) you use an autogeneration tool

At this point, you could have just as well copypasted (or autogenerated) a more compact, robust, and readable code without the extra overhead from the "library".

And guess what, after copypasting readable, meaningful, compact and efficient code for a few times, you start learning, and need less copypasting in the future.

The complex library also gives the false impression that everything underneath must be "super complex". For example, the STM32 ADCs are far from super complex - initializing it properly is around 5-6 lines of code, no big catches, nothing strange going on - only a few more operations than on a comparable AVR (around 3 lines). Sure, when you do it through the library, it's so many lines (typ. 20-30), full of long names, that the only way around is copypasting or autogenerating.

In fact, looking at the ADC initialization code now, it's not much different from the AVR ADC at all. The only "extras" are that you need to enable the ADC from the RCC registers first, and change the input pin to analog mode ("11" in MODER bits, similar to DDR in AVR, but two bits per pin because of more functionality). Then you need to set the ADC prescaler, channel(s) you want to convert, enable the ADC and give a start trigger, just like in AVR. If you want DMA, it's 7 lines of initialization more, but saves it back elsewhere since you don't need interrupts nor polling at all, often not any kind of ADC control code after the init!
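
The steps listed above map onto code roughly as follows. This is a host-runnable sketch against a mock, not code for a specific part: `adc_mock_t` lumps the relevant RCC, GPIO and ADC registers into one hypothetical struct, the function name `adc_init_single` is invented, and the bit masks are placeholders to be replaced with the values from the Reference Manual. A real part may also need a calibration step and a wait for a ready flag between enabling and starting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of the registers the text mentions. */
typedef struct {
    volatile uint32_t rcc_apb2enr;  /* peripheral clock enables */
    volatile uint32_t gpio_moder;   /* two mode bits per pin */
    volatile uint32_t adc_cfgr2;    /* ADC clock mode / prescaler */
    volatile uint32_t adc_chselr;   /* one bit per channel to convert */
    volatile uint32_t adc_cr;       /* enable and start bits */
} adc_mock_t;

/* Placeholder masks: take the real ones from the Reference Manual. */
#define RCC_ADC_CLOCK_EN     (1u << 9)
#define ADC_CLOCK_PCLK_DIV2  (1u << 30)
#define ADC_ENABLE           (1u << 0)
#define ADC_START            (1u << 2)

void adc_init_single(adc_mock_t *hw, unsigned pin, unsigned channel)
{
    assert(hw != NULL && pin < 16u && channel < 32u);

    hw->rcc_apb2enr |= RCC_ADC_CLOCK_EN;     /* 1. enable the ADC in RCC */
    hw->gpio_moder  |= 3u << (pin * 2u);     /* 2. pin to analog mode ("11") */
    hw->adc_cfgr2    = ADC_CLOCK_PCLK_DIV2;  /* 3. ADC prescaler */
    hw->adc_chselr   = 1u << channel;        /* 4. channel to convert */
    hw->adc_cr      |= ADC_ENABLE;           /* 5. enable the ADC */
    hw->adc_cr      |= ADC_START;            /* 6. start trigger */
}
```

Six register writes, one per step in the paragraph above, which is where the "5-6 lines" figure comes from.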

A proper library would have a usable interface and would deal with register-level stuff internally. The STM32 libraries mostly just wrap each and every register operation, leaving you to do the same work less efficiently.

The solution is simple: don't use those libraries. The "AVR-like" traditional access is, in almost all cases:
1) definitely easier than using the libraries, by an order of magnitude
2) easy enough that you can stop resorting to copypasting/autogenerating after gaining a bit of experience (100% impossible to do with the libraries - even if you actually learned each character needed by heart, just typing everything out would take too long!)

If you need complex network or USB filesystem stacks, etc., that would be different.
« Last Edit: February 07, 2018, 09:18:17 am by Siwastaja »
 

