Author Topic: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)  (Read 16140 times)


Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
See the attachment.

Project wasn't terribly bloated, it was looking OK. Then I added USART, which is pretty small... But it requires DMA... which is a F#(#¡|V@ PIG. Like worse than I could have imagined. I'm not even using DMA, and according to the map and final program size it's taking 24KB alone. This is for a bootloader so that's clearly on the list of things that aren't going to happen!

What I don't get is why. Why is DMA that I'm not using 24.25k?! If I turn Keil's Level 3 optimization on, I can get it to 20k total down from 35k, but that's still ridiculous for something I wanted to keep under 8k when it's done, vs 35k before I even add my code to it.

I know the popular opinion is going to be WRITE IT YOURSELF, which I do agree with, but that's not really the best use of my time right now. I might roll back to the std peripheral libs if I have to. Maybe I find another option, but I wanted to check thoughts here first.

Anyone?  :-//
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #1 on: October 02, 2015, 12:35:29 am »
Surely there is a UART support variation that doesn't suck in DMA?
or, if you're sure that you won't use DMA, you can replace the DMA code with 'stubs' - multiple symbols all defined to the same "return" instruction.
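A rough sketch of the stub idea, assuming a GCC-compatible toolchain targeting ELF (the `alias` attribute is a compiler extension). The `demo_dma_*` names, the signature and the return value are placeholders made up for illustration; real stubs would have to match the prototypes in the HAL headers.

```c
/* One tiny function body that does nothing and reports failure. */
int demo_dma_stub(void *handle)
{
    (void)handle;
    return 1; /* placeholder error code, standing in for something like HAL_ERROR */
}

/* Hypothetical DMA entry points, all resolved to the same stub body.
   The alias attribute makes each name a second symbol for demo_dma_stub. */
int demo_dma_init(void *handle)  __attribute__((alias("demo_dma_stub")));
int demo_dma_start(void *handle) __attribute__((alias("demo_dma_stub")));
int demo_dma_abort(void *handle) __attribute__((alias("demo_dma_stub")));
```

Every caller still links, but all three names cost only the few bytes of the one stub body.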
 

Offline ralphd

  • Frequent Contributor
  • **
  • Posts: 442
  • Country: ca
    • Nerd Ralph
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #2 on: October 02, 2015, 01:40:31 am »
Interesting. I've been thinking of trying stm32 starting with the f030.  Maybe I'll hold off on that idea since I like efficient coding.
Unthinking respect for authority is the greatest enemy of truth. Einstein
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #3 on: October 02, 2015, 01:52:15 am »
Ralph: you wouldn't have used the HAL libraries, anyway.
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #4 on: October 02, 2015, 02:01:14 am »
I'm not sure what is going on.

Yea, if there is a version of USART/UART that omits the DMA I can't find it! I just pretty much can't believe that it brings 24k of DMA in even if I don't have a single piece of USART code written let alone anything DMA!

No joke, I'd be pretty pissed if just the DMA is 24k and I was using it.

I had the same thought to gut the DMA parts. I'm certain that'll cause hell in the USART code.

Edit: I'll give hacking the DMA a try, but I think the better course of action might be to just get a better USART lib. I know there are some out there. I'm just not understanding how their HAL wouldn't even work for any of their chips under 48K rom.

It's not too late, I'm considering seeing if NXP has a chip that would work for this project.
« Last Edit: October 02, 2015, 02:04:18 am by jnz »
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 6616
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #5 on: October 02, 2015, 02:33:22 am »
The source for those functions can be found here:

https://my.st.com/public/STe2ecommunities/mcu/Lists/STM32Java/Attachments/955/stm32f4xx_hal_dma.c
https://my.st.com/public/STe2ecommunities/mcu/Lists/STM32Java/Attachments/955/stm32f4xx_hal_dma.h

From a cursory inspection, that code is pretty disgusting. Plenty of duplicated code, and most of the bloat is coming from the __HAL_DMA_GET_*_FLAG_INDEX(hdma) macros, which expand to a dozen-case if/else conditional each time they're used(!), despite hdma never actually changing throughout the function. There's also a "if (x) ... else if(!x) ..." :wtf: Here's a quick 5-minute rewrite of the top two bloaters... see how much it changes the size.

Code: [Select]
HAL_StatusTypeDef HAL_DMA_PollForTransfer(DMA_HandleTypeDef *hdma, uint32_t CompleteLevel, uint32_t Timeout)
{
  uint32_t temp, tmp, tmp1, tmp2;
  uint32_t tickstart = 0;

  uint32_t tcfi = __HAL_DMA_GET_TC_FLAG_INDEX(hdma);
  uint32_t htfi = __HAL_DMA_GET_HT_FLAG_INDEX(hdma);
  uint32_t tefi = __HAL_DMA_GET_TE_FLAG_INDEX(hdma);
  uint32_t fefi = __HAL_DMA_GET_FE_FLAG_INDEX(hdma);
  uint32_t dmefi = __HAL_DMA_GET_DME_FLAG_INDEX(hdma);

  uint32_t ready0, ready1;
  /* Get the level transfer complete flag */
  temp = (CompleteLevel == HAL_DMA_FULL_TRANSFER) ?
   tcfi /* Transfer Complete flag */
  :htfi;/* Half Transfer Complete flag */

  /* Get tick */
  tickstart = HAL_GetTick();

  while(__HAL_DMA_GET_FLAG(hdma, temp) == RESET)
  {
    tmp  = __HAL_DMA_GET_FLAG(hdma, tefi);
    tmp1 = __HAL_DMA_GET_FLAG(hdma, fefi);
    tmp2 = __HAL_DMA_GET_FLAG(hdma, dmefi);
    if((tmp != RESET) || (tmp1 != RESET) || (tmp2 != RESET))
    {
      if(tmp != RESET)
      {
        /* Update error code */
        hdma->ErrorCode |= HAL_DMA_ERROR_TE;

        /* Clear the transfer error flag */
        __HAL_DMA_CLEAR_FLAG(hdma, tefi);
      }
      if(tmp1 != RESET)
      {
        /* Update error code */
        hdma->ErrorCode |= HAL_DMA_ERROR_FE;
 
        /* Clear the FIFO error flag */
        __HAL_DMA_CLEAR_FLAG(hdma, fefi);
      }
      if(tmp2 != RESET)
      {
        /* Update error code */
        hdma->ErrorCode |= HAL_DMA_ERROR_DME;

        /* Clear the Direct Mode error flag */
        __HAL_DMA_CLEAR_FLAG(hdma, dmefi);
      }
      /* Change the DMA state */
      hdma->State= HAL_DMA_STATE_ERROR;
     
      /* Process Unlocked */
      __HAL_UNLOCK(hdma);

      return HAL_ERROR;
    } 
    /* Check for the Timeout */
    if(Timeout != HAL_MAX_DELAY)
    {
      if((Timeout == 0)||((HAL_GetTick() - tickstart ) > Timeout))
      {
        /* Update error code */
        hdma->ErrorCode |= HAL_DMA_ERROR_TIMEOUT;

        /* Change the DMA state */
        hdma->State = HAL_DMA_STATE_TIMEOUT;

        /* Process Unlocked */
        __HAL_UNLOCK(hdma);
       
        return HAL_TIMEOUT;
      }
    }
  }

  /* Clear the half transfer complete flag */
  __HAL_DMA_CLEAR_FLAG(hdma, htfi);
  if(CompleteLevel == HAL_DMA_FULL_TRANSFER)
  {
    /* Clear the transfer complete flag */
    __HAL_DMA_CLEAR_FLAG(hdma, tcfi);
    ready0 = HAL_DMA_STATE_READY_MEM0;
    ready1 = HAL_DMA_STATE_READY_MEM1;
  }
  else
  {
    ready0 = HAL_DMA_STATE_READY_HALF_MEM0;
    ready1 = HAL_DMA_STATE_READY_HALF_MEM1;
  }
  hdma->State = (hdma->Instance->CR & (uint32_t)(DMA_SxCR_DBM)) ?      /* Multi_Buffering mode enabled */
    ((hdma->Instance->CR & DMA_SxCR_CT) ? ready1 : ready0)
    : ready0; /* The selected Streamx EN bit is cleared (DMA is disabled and all transfers are complete) */
  if(CompleteLevel == HAL_DMA_FULL_TRANSFER)
  {
    /* Process Unlocked */
    __HAL_UNLOCK(hdma);
  }
  return HAL_OK;
}

void HAL_DMA_IRQHandler(DMA_HandleTypeDef *hdma)
{
  uint32_t i;
  uint32_t tefi = __HAL_DMA_GET_TE_FLAG_INDEX(hdma);
  uint32_t fefi = __HAL_DMA_GET_FE_FLAG_INDEX(hdma);
  uint32_t dmefi = __HAL_DMA_GET_DME_FLAG_INDEX(hdma);
  uint32_t htfi = __HAL_DMA_GET_HT_FLAG_INDEX(hdma);
  uint32_t tcfi = __HAL_DMA_GET_TC_FLAG_INDEX(hdma);

  struct {
   uint32_t i;
   uint32_t it;
   uint32_t err;
  } tefedme[] = {
   { tefi ,  DMA_IT_TE, HAL_DMA_ERROR_TE },
   { fefi ,  DMA_IT_FE, HAL_DMA_ERROR_FE },
   { dmefi, DMA_IT_DME, HAL_DMA_ERROR_DME }
  };
  /* Transfer Error Interrupt management ***************************************/
  /* FIFO Error Interrupt management ******************************************/
  /* Direct Mode Error Interrupt management ***********************************/
  for(i=0;i<3;i++) {
    if(__HAL_DMA_GET_FLAG(hdma, tefedme[i].i) != RESET
    && __HAL_DMA_GET_IT_SOURCE(hdma, tefedme[i].it) != RESET)
    {
      /* Disable the interrupt */
      __HAL_DMA_DISABLE_IT(hdma, tefedme[i].it);
      /* Clear the transfer error flag */
      __HAL_DMA_CLEAR_FLAG(hdma, tefedme[i].i);
      /* Update error code */
      hdma->ErrorCode |= tefedme[i].err;
      /* Change the DMA state */
      hdma->State = HAL_DMA_STATE_ERROR;
      /* Process Unlocked */
      __HAL_UNLOCK(hdma);
      if(hdma->XferErrorCallback != NULL)
      {
        /* Transfer error callback */
        hdma->XferErrorCallback(hdma);
      }
    }
  }

  /* Half Transfer Complete Interrupt management ******************************/
  if(__HAL_DMA_GET_FLAG(hdma, htfi) != RESET)
  {
    if(__HAL_DMA_GET_IT_SOURCE(hdma, DMA_IT_HT) != RESET)
    {
      /* Multi_Buffering mode enabled */
      if(((hdma->Instance->CR) & (uint32_t)(DMA_SxCR_DBM)) != 0)
      {
        /* Clear the half transfer complete flag */
        __HAL_DMA_CLEAR_FLAG(hdma, htfi);

        /* Current memory buffer used is Memory 0 */
        if((hdma->Instance->CR & DMA_SxCR_CT) == 0)
        {
          /* Change DMA peripheral state */
          hdma->State = HAL_DMA_STATE_READY_HALF_MEM0;
        }
        /* Current memory buffer used is Memory 1 */
        else
        {
          /* Change DMA peripheral state */
          hdma->State = HAL_DMA_STATE_READY_HALF_MEM1;
        }
      }
      else
      {
        /* Disable the half transfer interrupt if the DMA mode is not CIRCULAR */
        if((hdma->Instance->CR & DMA_SxCR_CIRC) == 0)
        {
          /* Disable the half transfer interrupt */
          __HAL_DMA_DISABLE_IT(hdma, DMA_IT_HT);
        }
        /* Clear the half transfer complete flag */
        __HAL_DMA_CLEAR_FLAG(hdma, htfi);

        /* Change DMA peripheral state */
        hdma->State = HAL_DMA_STATE_READY_HALF_MEM0;
      }

      if(hdma->XferHalfCpltCallback != NULL)
      {
        /* Half transfer callback */
        hdma->XferHalfCpltCallback(hdma);
      }
    }
  }
  /* Transfer Complete Interrupt management ***********************************/
  if(__HAL_DMA_GET_FLAG(hdma, tcfi) != RESET)
  {
    if(__HAL_DMA_GET_IT_SOURCE(hdma, DMA_IT_TC) != RESET)
    {
      if(((hdma->Instance->CR) & (uint32_t)(DMA_SxCR_DBM)) != 0)
      {
        /* Clear the transfer complete flag */
        __HAL_DMA_CLEAR_FLAG(hdma, tcfi);

        /* Current memory buffer used is Memory 1 */
        if((hdma->Instance->CR & DMA_SxCR_CT) == 0)
        {
          if(hdma->XferM1CpltCallback != NULL)
          {
            /* Transfer complete Callback for memory1 */
            hdma->XferM1CpltCallback(hdma);
          }
        }
        /* Current memory buffer used is Memory 0 */
        else
        {
          if(hdma->XferCpltCallback != NULL)
          {
            /* Transfer complete Callback for memory0 */
            hdma->XferCpltCallback(hdma);
          }
        }
      }
      /* Disable the transfer complete interrupt if the DMA mode is not CIRCULAR */
      else
      {
        if((hdma->Instance->CR & DMA_SxCR_CIRC) == 0)
        {
          /* Disable the transfer complete interrupt */
          __HAL_DMA_DISABLE_IT(hdma, DMA_IT_TC);
        }
        /* Clear the transfer complete flag */
        __HAL_DMA_CLEAR_FLAG(hdma, tcfi);

        /* Update error code */
        hdma->ErrorCode |= HAL_DMA_ERROR_NONE;

        /* Change the DMA state */
        hdma->State = HAL_DMA_STATE_READY_MEM0;

        /* Process Unlocked */
        __HAL_UNLOCK(hdma);     

        if(hdma->XferCpltCallback != NULL)
        {
          /* Transfer complete callback */
          hdma->XferCpltCallback(hdma);
        }
      }
    }
  }
}
The second part of HAL_DMA_IRQHandler could probably be shrunk down similarly, I'll leave someone else to try that. :)
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #6 on: October 02, 2015, 03:12:30 am »
Wow, thanks amyk! That's a ton more work than people usually put into forum posts. I'll grab that and move it outside of Keil's manager and see if it's smaller.

Even still... I don't have much confidence in the HAL right now! I'm thinking of just rolling back Keil to use the std peripheral library which isn't perfect but it's better. I'll update tomorrow.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #7 on: October 02, 2015, 07:34:19 am »
Are you sure that you have the equivalent of "garbage collect sections" configured?
Does Keil have a call tree analyzer so you can see why functions are being included?
 

Offline Chris C

  • Frequent Contributor
  • **
  • Posts: 259
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #8 on: October 02, 2015, 02:14:41 pm »
At a minimum, I wouldn't be able to stop myself from rewriting those horrible __HAL_DMA_GET_*_FLAG_INDEX(hdma) macros [amyk] pointed out, to be based on lookup tables instead.  Looks like you'll need one table for the DMA1_Stream* address values that can be used for all the macros, then will need a separate table for each macro for the DMA_FLAG_* bitmask values.

If I'd written this from scratch or wanted to go further, I'd use the same lookup arrays, but only to look up all the required values for a particular channel ONCE upon channel initialization; and store them in the DMA_HandleTypeDef struct for future reference.  Then even array offset calculations aren't needed.  Getting a DMA flag is just a matter of getting the address and bitmask, both from a fixed offset in the struct, which is fast and efficient.
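A minimal sketch of that "look it up once at init" idea, runnable on a PC. All `demo_*` names are hypothetical, and the bit positions follow the F4 status register layout as I remember it (check the reference manual before trusting them):

```c
#include <stdint.h>

/* Per-stream flag description, looked up once at init. */
typedef struct {
    uint8_t  reg_index;  /* 0 = low status register, 1 = high status register */
    uint32_t tc_mask;    /* "transfer complete" bit for this stream */
    uint32_t te_mask;    /* "transfer error" bit for this stream */
} demo_flag_info_t;

#define DEMO_ENTRY(reg, shift) { (reg), 0x20u << (shift), 0x08u << (shift) }

/* One table shared by all streams (assumed layout, for illustration only). */
static const demo_flag_info_t demo_flag_table[8] = {
    DEMO_ENTRY(0, 0), DEMO_ENTRY(0, 6), DEMO_ENTRY(0, 16), DEMO_ENTRY(0, 22),
    DEMO_ENTRY(1, 0), DEMO_ENTRY(1, 6), DEMO_ENTRY(1, 16), DEMO_ENTRY(1, 22),
};

/* Handle that caches the register address and masks for its stream. */
typedef struct {
    volatile uint32_t *status_reg;
    uint32_t tc_mask;
    uint32_t te_mask;
} demo_dma_handle_t;

/* Do the table lookup exactly once, when the channel is set up. */
void demo_dma_init(demo_dma_handle_t *h, volatile uint32_t *status_regs, unsigned stream)
{
    const demo_flag_info_t *info = &demo_flag_table[stream & 7u];
    h->status_reg = &status_regs[info->reg_index];
    h->tc_mask = info->tc_mask;
    h->te_mask = info->te_mask;
}

/* Later flag checks are one load and one AND, with no per-call decoding. */
int demo_dma_transfer_complete(const demo_dma_handle_t *h)
{
    return (*h->status_reg & h->tc_mask) != 0u;
}
```

The trade-off is a few extra words of RAM per handle in exchange for not decoding the stream on every flag access.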

Not sure how bootloader code works on ARM.  But if once DMA is initted, you always exit the bootloader with a soft reset, then you might not need HAL_DMA_DeInit.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 6616
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #9 on: October 02, 2015, 03:26:52 pm »
You're welcome, the fact that it only took me around 10 minutes to rewrite those functions much better shows just how much effort they originally spent on it. ::)
Thinking about this a little more I believe the reason why it's pulling in DMA is that the USART functions are written such that they can use DMA when asked to, with something like "if(usart->use_dma) ..." in the code, so even if you never touch the DMA at runtime the compiler and linker are not intelligent enough to figure that out and remove the associated code, which then pulls in the rest of the DMA stuff and all that it depends on too.
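To illustrate that hypothesis, here is a small sketch with made-up names (`demo_usart_t`, `demo_dma_start` and friends are not HAL identifiers). The point is only that the branch is decided at run time, so the reference to the DMA function survives into the link:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int    use_dma;       /* chosen at run time, so the compiler cannot assume it is 0 */
    size_t bytes_polled;  /* demo bookkeeping only */
    size_t bytes_dma;
} demo_usart_t;

/* Stand-in for a large DMA driver function. */
int demo_dma_start(demo_usart_t *u, const uint8_t *data, size_t len)
{
    (void)data;
    u->bytes_dma += len;
    return 0;
}

/* Stand-in for simple polled transmission. */
static int demo_polled_send(demo_usart_t *u, const uint8_t *data, size_t len)
{
    (void)data;
    u->bytes_polled += len;
    return 0;
}

int demo_usart_send(demo_usart_t *u, const uint8_t *data, size_t len)
{
    /* This call is a link-time reference to demo_dma_start(), even if
       use_dma is never set anywhere in the application. */
    if (u->use_dma) {
        return demo_dma_start(u, data, len);
    }
    return demo_polled_send(u, data, len);
}
```

If the send function itself is kept, everything it can reach is kept too, whether or not that path ever runs.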

I don't think a lookup table is even needed for those macros; the DMA controller registers have a very regular structure (even the datasheet shows this), so all you need is the address and you can get the stream/channel number from the middle bits of it, which can then be used as shift amounts.
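Something along these lines, as a sketch only. It assumes the F4-style layout as I remember it (stream n registers at controller base + 0x10 + 0x18 * n, flag groups at bit 0, 6, 16 and 22 of the status register); the `demo_*` names are made up and it works on plain integers so it can be tried on a PC:

```c
#include <stdint.h>

#define DEMO_STREAM0_OFFSET 0x10u  /* assumed offset of stream 0 registers */
#define DEMO_STREAM_STRIDE  0x18u  /* assumed size of one stream's register block */

/* Recover the stream number (0..7) from the stream's register address. */
unsigned demo_stream_index(uint32_t stream_addr)
{
    return ((stream_addr & 0xFFu) - DEMO_STREAM0_OFFSET) / DEMO_STREAM_STRIDE;
}

/* Shift of the stream's flag group inside its status register.
   Streams 0..3 and 4..7 share the pattern 0, 6, 16, 22. */
unsigned demo_flag_shift(unsigned stream)
{
    return 6u * (stream & 1u) + 16u * ((stream >> 1) & 1u);
}

/* Build a flag mask from the bit's position inside the 6-bit group. */
uint32_t demo_flag_mask(uint32_t stream_addr, uint32_t bit_in_group)
{
    return bit_in_group << demo_flag_shift(demo_stream_index(stream_addr));
}
```

For example, with a controller base of 0x40026000, `demo_flag_mask(0x40026088u, 0x20u)` decodes stream 5 and places the top bit of the group at bit 11.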

I also think I found a bug :palm:
Code: [Select]
#define DMA_FLAG_FEIF0_4                    ((uint32_t)0x00800001)
#define DMA_FLAG_DMEIF0_4                   ((uint32_t)0x00800004)
That 8 is where a reserved bit would be. Maybe it wasn't noticed since that bit always reads 0, but this is the sort of stuff that wouldn't happen if they just generated these bitmasks using a macro.
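A sketch of what "generate these bitmasks using a macro" could look like. The `DEMO_*` names are invented for illustration and the layout is again the F4-style one from memory, so treat the numbers as assumptions:

```c
#include <stdint.h>

/* Bit positions inside one stream's 6-bit flag group (bit 1 is reserved). */
#define DEMO_FEIF_BIT   0x01u
#define DEMO_DMEIF_BIT  0x04u
#define DEMO_TEIF_BIT   0x08u
#define DEMO_HTIF_BIT   0x10u
#define DEMO_TCIF_BIT   0x20u

/* Shift of stream n's group inside its status register: 0, 6, 16, 22. */
#define DEMO_GROUP_SHIFT(n)  (6u * ((n) & 1u) + 16u * (((n) >> 1) & 1u))
#define DEMO_FLAG(bit, n)    ((uint32_t)(bit) << DEMO_GROUP_SHIFT(n))

/* Named flags are derived, never typed in by hand. */
#define DEMO_FLAG_FEIF0_4    DEMO_FLAG(DEMO_FEIF_BIT, 0u)
#define DEMO_FLAG_DMEIF0_4   DEMO_FLAG(DEMO_DMEIF_BIT, 0u)
#define DEMO_FLAG_TCIF3_7    DEMO_FLAG(DEMO_TCIF_BIT, 3u)

/* All bits that belong to some flag; anything outside is reserved. */
#define DEMO_VALID_BITS (DEMO_FLAG(0x3Du, 0u) | DEMO_FLAG(0x3Du, 1u) | \
                         DEMO_FLAG(0x3Du, 2u) | DEMO_FLAG(0x3Du, 3u))

/* Compile-time guard: a generated flag must not touch a reserved bit. */
_Static_assert((DEMO_FLAG_FEIF0_4 & ~DEMO_VALID_BITS) == 0u,
               "flag touches a reserved bit");

/* Run-time version of the same check, handy for testing hand-written values. */
int demo_uses_reserved_bit(uint32_t mask)
{
    return (mask & ~DEMO_VALID_BITS) != 0u;
}
```

With this check, a hand-typed value like 0x00800001 gets flagged, while the generated constants pass.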
« Last Edit: October 02, 2015, 03:29:18 pm by amyk »
 

Offline newbrain

  • Frequent Contributor
  • **
  • Posts: 795
  • Country: se
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #10 on: October 02, 2015, 05:26:20 pm »
Methinks it depends on the linker options.
HAL is bloated, but what's not used is generally not brought in, at least IME.
In the picture, all HAL modules are enabled in stm32f4xx_hal_conf.h, while only GPIO (for the UART pins) and UART are actually used (yes, it's VS, but the compiler is gcc).

Code: [Select]
1>  ------------------- Memory utilization report -------------------
1>  Used FLASH: 4592 bytes out of 512KB (0%)
1>  Used RAM: 52 bytes out of 128KB (0%)
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
Nandemo wa shiranai wa yo, shitteru koto dake.
 

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3011
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #11 on: October 02, 2015, 05:39:33 pm »
Hey guys, check in Keil that the option "One ELF Section per Function" is checked, and see how much it helps.
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #12 on: October 02, 2015, 05:59:53 pm »
[PRETTY MUCH SOLVED]

Hmmm..... Thanks for the A/B compare newbrain! A couple of interesting things going on...

1. I brought in USART, not UART. But it's surely a small difference, just figured I'd toss that out there.

2. There is definitely something else going on! I didn't realize that when I made the project in Keil, I added their USART API. Which APPEARS to me (who isn't very good at this) to be an API that Keil recommends for USART coding that sticks to CMSIS to allow switching processors. Since that's not a goal here, I'll avoid that from now on.  Attached image.

So.... Surprising and not surprising results for Blinky project header with no actual code :

HAL + Keil's CMSIS USART + No optimization = 35.4k!!!
HAL + Keil's CMSIS USART + Level3 Optimization = 20.2k
HAL + Keil's CMSIS USART + One Elf Section Per Function + No optimize = 9.7k (thanks Yansi!!!)
HAL + No CMSIS USART + No optimization = 5.9k
HAL + No CMSIS USART + One Elf Section Per Function + Level3 Optimize = 3.1K (only goes to 3.6k when adding CMSIS USART back in)


So it seems that adding the CMSIS layer was a huge issue that with default settings caused a lot of additional functions to be duplicated (I guess!?). Removing the CMSIS layer helps a ton. But also turning on One ELF Section per Function in Keil's C/C++ options really does it.

I'm fully aware that as I start to actually use the UART stuff the code will get larger, I'm not dumb, but this is a long way from that terrible DMA bloat that didn't need to be there. So in conclusion....

  • Yes, using Keil's default settings is terrible for file size! Since they offer a free up-to-32k version this makes sense I guess
  • Yes, STM's Cube/HAL is garbage
  • No, they aren't even approaching a HAL in the right way
  • No, it is not bug free
  • Yes, I probably should have experimented a little more before just assuming the entire issue was ST's fault  :scared:
  • Thank you everyone! This I hope will be good info for someone in the future.

All that said... What does One ELF Section per Function do? Because my initial thought was inlining, but according to Keil's manual:

One ELF Section per Function
    Generate one ELF section for each function in source file. Output sections are named with the same name as the function that generates the section. Allows you to optimize code or to locate each function on individual memory addresses. Sets the compiler command-line option --split_sections.

That shouldn't be affecting my code size, should it? That seems like it's just making the map/elf/objects appear clearer? Right?
« Last Edit: October 02, 2015, 06:04:32 pm by jnz »
 

Offline newbrain

  • Frequent Contributor
  • **
  • Posts: 795
  • Country: se
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #13 on: October 02, 2015, 06:46:38 pm »
Briefly explained, that option will make all the functions independently linkable, as if each had its own .o file.
The linker will happily toss away all the ones that are unused.
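A tiny illustration of the effect. The Keil option name comes from the manual text quoted above; the gcc flags in the comment are the usual equivalents on that toolchain, and the file and function names are made up:

```c
/* demo_sections.c: two functions, only one of which is ever called.
 *
 * Compiled the ordinary way, both functions share one code section, so the
 * linker can only keep or drop them together. With one section per function
 * the unused one can be discarded on its own:
 *
 *   gcc:  compile with -ffunction-sections, link with -Wl,--gc-sections
 *   Keil: --split_sections (the "One ELF Section per Function" checkbox)
 */
int demo_used(int x)
{
    return x + 1;
}

/* Never referenced: a candidate for removal once it has its own section. */
int demo_unused(int x)
{
    return x * 1000;
}
```

Comparing the map file with and without the option should show whether `demo_unused` made it into the image.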

As for UART vs. USART, if you are using only UART functions, no need to bring in USART.

I've never used keil CMSIS UART (honestly, I did not know there was such a thing), but HAL should be (much, much) more than enough.

Glad to have been of some help!
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #14 on: October 02, 2015, 08:09:17 pm »
Thanks for the explanation and help!  I guess my next question is why isn't this the default behavior? And likewise, why would I want the default/unchecked scenario of absolutely stupid size!?

As to CMSIS UART vs HAL. CubeHAL is STM32>STM32, Keil's idea of CMSIS USART is AnyARM>AnyARM. I guess it makes sense if you knew flat out you were going to change between chips weekly. In theory you could do that, but I've found the reality (as most of us seem to agree) is that when you start getting through X/Y/Z peripheral, or compiler, or RTOS, or library, you just aren't changing chips without re-coding a lot anyhow. It doesn't add a lot of size once the settings are right, but I don't need it for this. It's possible that you might need other Keil CMSIS APIs if working with their RTX RTOS (like their SPI implementation if using external RAM/ROM, maybe). I'm not sure. I'll avoid them for now.

Well, now I feel slightly bad. This is only partly ST having a dumb HAL. The rest is Keil's default settings causing massive bloat.
 

Offline MT

  • Super Contributor
  • ***
  • Posts: 1275
  • Country: cn
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #15 on: October 02, 2015, 08:52:18 pm »
Quote
you just aren't changing chips without re-coding a lot anyhow.
Answer is that "clive 1" have joined the dark side!
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #16 on: October 02, 2015, 08:55:22 pm »
? Did ST hire him to write bad HAL ?

I've seen his posts around and know he's good for answers usually, but don't get this reference.
 

Offline MT

  • Super Contributor
  • ***
  • Posts: 1275
  • Country: cn
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #17 on: October 02, 2015, 09:04:24 pm »
Yes, we could wish he had! But then he would have been fired on the spot, or he would have left ST all by himself awfully quickly!  :-DD
« Last Edit: October 02, 2015, 09:07:33 pm by MT »
 

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3011
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #18 on: October 02, 2015, 10:29:07 pm »
This bloat can be fixed only by not using it. Any decent designer should be able to run the MCU bare-bones, otherwise he's not decent.
Some say that it is a waste of time to write code almost from scratch. I say the opposite. What do you think? Is it better to sit half a day (usually much more, even for simple tasks) in front of CubeMX/HAL scratching your head over how to make that bastard work properly, or just to invest a few (tens of) minutes reading the docs and then implementing it, usually on the first try? I don't know about you, but I vote for option B.

I am really not interested in using HAL. The CubeMX idea seems good, but it is implemented on the poorest lib ever made. The programmer's interface is done with the guts out and the performance is also shit, as was already (many times) pointed out. I hope many decently experienced designers will agree that making a project from scratch, helping yourself a little with the STDPeriph or snippets, is nowhere near difficult, even when speaking about complex projects including SDRAM, TFTs, RTOS... It's mostly about people being lazy and unwilling to learn stuff.

And the new LL (low-layer) stuff: I don't really care either. As long as we have STDPeriph available, with all we need, I don't see why not to use it. It isn't ideal, but it's MUCH better and simpler to use than the "abstraction" bullshit.
By the way, some time ago a new version of STDPeriph for STM32F4 was released, containing the peripherals also for STM32F446 and 469. So we already have STDPeriph code for the new fancy peripherals like SPDIF-RX, QSPI and so on. Porting this to STM32F7 seems quite possible!
...I really don't know why ST so maniacally refuses to make an STDP lib for the F7 line. They have the code already written! The new peripherals on F446/469 are the same (or should be, to the best of my knowledge) as on F7... Only some other M7 core related stuff has to be made from scratch.

If I had enough spare time, I would try to port the STDP to F7 myself. Or is anyone willing to contribute to a guerrilla version of the F7 STDP Library? This would be much more useful work than swearing about Cube/HAL again and again.
 

Offline obiwanjacobi

  • Frequent Contributor
  • **
  • Posts: 968
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #19 on: October 04, 2015, 05:35:27 am »
I see a lot of these libraries use plain C - is there a reason for that? Especially on the larger MCUs a good C++ library would not present any significant overhead and could be much more intuitive and easy to use...?

I am only a hobbyist and not very experienced in reading MCU datasheets but it takes me a lot longer than 10's of minutes to flesh out a Usart on an AVR - spent most of the day trying to understand all the details and coming up with a good structure to represent it in my library. I guess practice makes perfect here but still.

I hear/see a lot of complaints about these libraries shipped by manufacturers, what would you say would be an example of a good library?
Arduino Template Library | Zalt Z80 Computer
Wrong code should not compile!
 

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3011
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #20 on: October 04, 2015, 10:49:12 am »
If you can't figure out UART on AVR, why the hell on earth would it help you to have a library in C++? 

Leave that C++ rubbish for PC bloatcode programmers.

By the way, what's the point of the common thought that the bigger the MCU, the bigger the bullshit overhead code it will handle? This whole idea is why we have today an 8-core processor to make a call with your phone. And still it's slow as a whack and has almost no useful functionality, except very sophisticated espionage tools.

Instead of taking advantage of the increasing computational power of today's MCUs, most people simply throw it away, because of silliness and unwillingness to do anything properly, because they'd need to stick their nose into the literature, read and learn. This phenomenon is partially understandable at the corporate level, where there is no room for you to do things properly even if you wanted to... but it's rather sad that most hobbyists and electronics enthusiasts are just throwing their chance away too.

You never can understand how helpful it is to learn something, before you actually learn it.  :)

There are so many things today's MCUs can handle, if you don't throw computational power away with bad coding practice...
See this, my favourite: an AVR mega644, a little overclocked (32MHz), playing S3M from a CF/HDD IDE through an I2S stereo DAC.



//EDIT: I forgot to answer your last question! Doh. I don't want to pick an example. No library can be good for all applications. Every application's needs are specific. It is up to the programmer to choose whatever will fit the application best.

On STM32 I personally prefer to combine the STDPeriph library with direct register access. The heavy register work, like initializing peripherals (code that is called once or so), I leave to the library. But checking flags and setting flags I mostly do manually in registers. (Why would I use a bloaty function to check a single flag, if I know the register and bit right off the top of my head?)
Sure, there are applications where not much code optimization is needed, but I rather like more optimized code. Some would say that such code is not portable. Maybe it is not, but I don't care! You write optimized code or portable code. These two don't mix well. Mostly I don't need code portability, so I choose the optimized.
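For anyone wondering what the two styles look like side by side, a rough sketch. The register block, the `demo_*` names and the bit position are placeholders; on a real part they come from the device header:

```c
#include <stdint.h>

/* Hypothetical peripheral register block. */
typedef struct {
    volatile uint32_t SR;  /* status register */
    volatile uint32_t DR;  /* data register */
} demo_uart_regs_t;

#define DEMO_SR_RXNE (1u << 5)  /* "receive register not empty" (assumed bit) */

/* Library style: a function call just to test one bit. */
int demo_get_flag(const demo_uart_regs_t *uart, uint32_t flag)
{
    return (uart->SR & flag) != 0u;
}

/* Direct style: read a byte if one is waiting, touching the registers directly. */
int demo_try_read(demo_uart_regs_t *uart, uint8_t *out)
{
    if ((uart->SR & DEMO_SR_RXNE) == 0u) {
        return 0;  /* nothing received yet */
    }
    *out = (uint8_t)uart->DR;
    return 1;
}
```

A compiler may well inline the helper, so the real difference depends on the library and the optimization settings.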

By the way, porting code in between STM32 chips is not a big deal, as the chips are designed with stunningly good compatibility in between them. (Unlike Atmel AVRs... where there are no two similar parts.. Doh.)
« Last Edit: October 04, 2015, 11:34:52 am by Yansi »
 

Offline obiwanjacobi

  • Frequent Contributor
  • **
  • Posts: 968
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #21 on: October 04, 2015, 11:28:20 am »
Quote
If you can't figure out UART on AVR, why the hell on earth would it help you to have a library in C++?

Perhaps you misunderstood. My remark was a response to the 10-minute statement made earlier. I meant to say that that estimate might be a little optimistic. Perhaps I am very slow, it does take me some time and a couple of re-reads to fully grasp the way they've organized it.

Also I think calling methods with good names is a LOT more readable than poking bits in and out of abstractly named memory locations. Details in the datasheet can be taken care of by the library (up to a certain level). The idea would be to create an API that is intuitive enough to get 80% of the code going in 20% of the effort.

Quote
Leave that C++ rubbish for PC bloatcode programmers.
Yes, I have come to realize that most coders of embedded systems are a little allergic to C++. For me it is the way I think about software problems - in objects. And it is true that adding layers of abstraction will induce extra cost on hardware resources - but it also *should* make development faster, easier and have fewer bugs. Usually dev-time is more expensive than a bigger chip... YMMV.
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 1434
  • Country: 00
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #22 on: October 05, 2015, 10:19:58 am »
Quote
If you can't figure out UART on AVR, why the hell on earth would it help you to have a library in C++?

Leave that C++ rubbish for PC bloatcode programmers.

By the way, what's the point of the common thought that the bigger the MCU, the bigger the bullshit overhead code it will handle? This whole idea is why we have today an 8-core processor to make a call with your phone. And still it's slow as a whack and has almost no useful functionality, except very sophisticated espionage tools.

Instead of taking advantage of the increasing computational power of today's MCUs, most people simply throw it away, because of silliness and unwillingness to do anything properly, because they'd need to stick their nose into the literature, read and learn. This phenomenon is partially understandable at the corporate level, where there is no room for you to do things properly even if you wanted to... but it's rather sad that most hobbyists and electronics enthusiasts are just throwing their chance away too.

I couldn't have said it better and I fully agree.
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 1434
  • Country: 00
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #23 on: October 05, 2015, 10:24:04 am »
Yes, I have come to realize that most coders of embedded systems are a little allergic to C++.

Often, for good  reasons.

Anyway, it's perfectly possible to write object oriented code in C.
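
For readers who haven't seen the pattern, here is a minimal sketch of what "object oriented code in C" usually means: a struct for the data plus a table of function pointers acting as a hand-made vtable. The `SerialPort` names are invented for illustration and do not come from any vendor library; the code is C-style but written so that it compiles as C++.

```cpp
#include <cstddef>

// Hypothetical "class": data plus a table of function pointers.
struct SerialPort;

struct SerialPortOps {
    void (*put_char)(SerialPort *self, char c);  // "virtual" method slot
};

struct SerialPort {
    const SerialPortOps *ops;  // points at the implementation in use
    char buffer[16];           // stand-in for a hardware TX register
    std::size_t count;
};

// One concrete implementation: stores characters instead of touching hardware.
static void buffer_put_char(SerialPort *self, char c) {
    if (self->count < sizeof(self->buffer)) {
        self->buffer[self->count++] = c;
    }
}

static const SerialPortOps buffer_ops = { buffer_put_char };

// "Constructor"
inline void serial_port_init(SerialPort *self) {
    self->ops = &buffer_ops;
    self->count = 0;
}

// "Method" dispatched through the ops table
inline void serial_port_write(SerialPort *self, const char *text) {
    while (*text != '\0') {
        self->ops->put_char(self, *text++);
    }
}
```

The cost is one pointer per object and an indirect call per method, which is roughly what a C++ virtual call costs as well.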
 

Offline obiwanjacobi

  • Frequent Contributor
  • **
  • Posts: 968
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #24 on: October 05, 2015, 10:39:34 am »
Yes, I have come to realize that most coders of embedded systems are a little allergic to C++.

Often, for good  reasons.

Anyway, it's perfectly possible to write object oriented code in C.

I rest my case...   :-DD
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #25 on: October 05, 2015, 10:44:15 am »
An example or many examples of badly written HAL API do not mean that all HAL API have to be bad  ;D

ST's API is very low level. Most of the calls boil down to this: instead of writing a line or two of C that modifies an I/O register, you get to write it as a function call! Oh joy! Talk about pretty useless.

At least there are some useful ones such as the ones that calculate the baudrate etc.

And I am not a fan of creating an "init" struct just so that a user gets to learn another struct, which is then used to initialize the low level I/O register "structure" - just make the API take (sensible) arguments and be done with it.

CubeMX could be nice for initialization code but does nothing about writing code afterward.

   
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 1434
  • Country: 00
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #26 on: October 05, 2015, 02:52:30 pm »
Usually dev-time is more expensive than a bigger chip... YMMV.

In a professional environment, this is usually not the case. At least not in case your design is going to be produced in large quantities...

It's another story when you work in an academic or research environment.

 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #27 on: October 05, 2015, 08:10:51 pm »
These are some of the reasons we decided to use a C++ Style / "C with Classes" for our JumpStart API. The separate name space inherent with objects means that we can use similar or the same names for different objects, without arbitrarily adding a module prefix or suffix. It makes the program easier to read and maintain.

As for full C++, 10 years ago, C++ was supposed to "take over" the embedded world; however, the reality is that most embedded engineers are hardware folks, even for the 32-bit chips with more memory. According to the embedded.com survey, C users remain at more or less 50% of the embedded programmer population, and the number of C++ users has not increased after the initial ramp and in fact has dropped slightly in the last few years.

I am sure studies can be made to discover all the reasons why, but in this case, the users have spoken.

...

Also I think calling methods with good names is a LOT more readable than poking bits in and out of abstractly named memory locations. Details in the datasheet can be taken care of by the library (up to a certain level). The idea would be to create an API that is intuitive enough to get 80% of the code going in 20% of the effort.

Leave that C++ rubbish for PC bloatcode programmers.
Yes, I have come to realize that most coders of embedded systems are a little allergic to C++. For me it is the way I think about software problems - in objects. And it is true that adding layers of abstraction will induce extra cost on hardware resources - but it also *should* make development faster, easier and have less bugs. Usually dev-time is more expensive than a bigger chip... YMMV.
 

Offline targ2002

  • Contributor
  • Posts: 11
  • Country: gb
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #28 on: October 05, 2015, 08:51:35 pm »
These are some of the reasons we decided to use a C++ Style / "C with Classes" for our JumpStart API. The separate name space inherent with objects means that we can use similar or the same names for different objects, without arbitrarily adding a module prefix or suffix. It makes the program easier to read and maintain.

As for full C++, 10 years ago, C++ was supposed to "take over" the embedded world; however, the reality is that most embedded engineers are hardware folks, even for the 32-bit chips with more memory. According to the embedded.com survey, C users remain at more or less 50% of the embedded programmer population, and the number of C++ users has not increased after the initial ramp and in fact has dropped slightly in the last few years.

I am sure studies can be made to discover all the reasons why, but in this case, the users have spoken.

...

Also I think calling methods with good names is a LOT more readable than poking bits in and out of abstractly named memory locations. Details in the datasheet can be taken care of by the library (up to a certain level). The idea would be to create an API that is intuitive enough to get 80% of the code going in 20% of the effort.

Leave that C++ rubbish for PC bloatcode programmers.
Yes, I have come to realize that most coders of embedded systems are a little allergic to C++. For me it is the way I think about software problems - in objects. And it is true that adding layers of abstraction will induce extra cost on hardware resources - but it also *should* make development faster, easier and have less bugs. Usually dev-time is more expensive than a bigger chip... YMMV.
I think that the reason a lot of embedded sw devs don't use C++ is that it is harder to prove that there are no software bugs. This is especially important where you are talking about safety critical code or where it is difficult to upgrade the sw on the device.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #29 on: October 06, 2015, 01:31:17 am »
Quote
As for UART vs. USART, if you are using only UART functions, no need to bring in USART.
IIRC, ST ARMs have two separate peripherals at the hardware level (UART and USART); they might require different APIs (although, glancing quickly at the STM32F1xx manual, they have identical register definitions, so perhaps it's just a pinmux difference?).


Quote
Leave that C++ rubbish for PC bloatcode programmers.
The thing is, there are some C++ features  that would REALLY HELP this sort of code, and they're NOT ones that cause bloat.  In particular:
1) Function Overloading
    Function overloading allows multiple functions to be defined with the same name, as long as they have different parameters.
    This means that instead of having a function "USART_init(struct uart_config_params *config);", where uart_config_params contains all of the parameters needed to initialize the USART in any possible mode supported by the hardware (IrDA, LIN, RS485, synchronous, async, etc.), and then having USART_init() contain a giant case statement handling all those possible cases, a HAL library could define multiple smaller functions:
Code: [Select]
   void USART_init(struct async_usart_config_params *config);
   void USART_init(struct irda_usart_config_params *config);
   void USART_init(struct lin_usart_config_params *config);
   void USART_init(struct rs485_usart_config_params *config);
   void USART_init(struct sync_usart_config_params *config);
Each one of those wouldn't have to contain code that implemented the others, and they could still share common code.  (Now, it turns out that the STM32 usart code isn't so bad, but the above example of lumped features is from Atmel's ASF, so it's still a real problem.)  Runtime cost: zero.

2) Templates and Template Meta Programming
As I understand it, this is essentially the pre-processing language that C should have had years ago, rather than sticking to the "all the preprocessor does is text substitution" crock that we have.  With a bit of effort, templates can give you the sort of "optimize away all these things because their values are constant and known at compile time" that would otherwise require a really ugly combination of C preprocessing and non-standard inlining features, without being ugly or non-standard.  Runtime cost: zero (less than zero, really, since it will generally replace run-time decisions with compile-time decisions).
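
To make the template point concrete, here is a minimal sketch, not taken from any vendor library. `BaudDivisor` is a hypothetical helper, and the formula (divisor = clock / baud, rounded to nearest) is a simplification that ignores the real register encoding of any particular part. The clock and baud rate are template parameters, so the compiler does the division and the range checks and only the resulting constant is left.

```cpp
#include <cstdint>

// Hypothetical helper: computes a UART divisor from constants at compile time.
template <std::uint32_t ClockHz, std::uint32_t Baud>
struct BaudDivisor {
    static_assert(Baud != 0, "baud rate must be non-zero");

    // Rounded-to-nearest divisor.
    static constexpr std::uint32_t value = (ClockHz + Baud / 2) / Baud;
    static_assert(value >= 1 && value <= 0xFFFF, "divisor does not fit in 16 bits");

    // Baud rate actually produced by the rounded divisor, and its error in
    // tenths of a percent, both evaluated by the compiler.
    static constexpr std::uint32_t actual = ClockHz / value;
    static constexpr std::uint32_t error_permille =
        (actual > Baud ? actual - Baud : Baud - actual) * 1000 / Baud;
    static_assert(error_permille <= 20, "baud rate error exceeds 2%");
};

// Example: 8 MHz peripheral clock, 9600 baud. No division happens at run time.
constexpr std::uint32_t kDivisor = BaudDivisor<8000000, 9600>::value;
```

An impossible combination (say, a baud rate far too high for the clock) fails the build instead of producing a device that silently talks at the wrong speed.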

Now, the big danger (IMO) is that C++ programmers DO come from a desktop/server environment, and don't tend to be very aware of when they are or are not introducing extra overhead.  You're likely to get "why wouldn't I use an STL container for my FIFO?" (because that implementation assumes and uses dynamic memory allocation!) and similar. 
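
As an illustration of the alternative to a heap-backed container, here is a minimal sketch of a fixed-capacity FIFO whose storage lives inside the object. `StaticFifo` is an invented name, and the sketch is not interrupt-safe as written; sharing it between an ISR and main code would need extra care that is not shown here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity FIFO: storage is a plain array inside the object,
// so nothing is allocated from the heap. N is fixed at compile time.
template <typename T, std::size_t N>
class StaticFifo {
    static_assert(N > 0, "capacity must be at least 1");
public:
    // Returns false when the FIFO is full; the caller decides what to do.
    bool push(const T &item) {
        if (count_ == N) return false;
        data_[(head_ + count_) % N] = item;
        ++count_;
        return true;
    }
    // Returns false when the FIFO is empty; otherwise copies the oldest item to out.
    bool pop(T &out) {
        if (count_ == 0) return false;
        out = data_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }
    std::size_t size() const { return count_; }
private:
    T data_[N] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

Something like `StaticFifo<std::uint8_t, 64> rx;` then costs 64 bytes plus two indices, with the RAM use visible at link time.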
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 6616
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #30 on: October 06, 2015, 03:22:13 am »
2) Templates and Template Meta Programming
As I understand it, this is essentially the pre-processing language that C should have had years ago, rather than sticking to the "all the preprocessor does is text substitution" crock that we have.  With a bit of effort, templates can give you the sort of "optimize away all these things because their values are constant and known at compile time" that would otherwise require a really ugly combination of C preprocessing and non-standard inlining features, without being ugly or non-standard.  Runtime cost: zero (less than zero, really, since it will generally replace run-time decisions with compile-time decisions).
Actually, templates are a known bloater since they let you easily generate huge amounts of nearly-identical code very quickly.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #31 on: October 06, 2015, 05:56:00 am »
Quote
templates are a known bloater since they let you easily generate huge amounts of nearly-identical code very quickly.
Differently than the combination of macros and inlines that I was claiming it would replace?
Yes, I suppose that you have to be careful how you use them; but they're not inherently bloaty...
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 1434
  • Country: 00
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #32 on: October 06, 2015, 06:20:32 am »
A bit outdated but still (mostly) valid:

http://harmful.cat-v.org/software/c++/linus

 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #33 on: October 06, 2015, 06:54:18 am »
Ah, good ol' Mr. Torvalds

"*YOU* are full of bullshit..." ha ha, he pulls no punches.
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 12086
  • Country: gb
    • Mike's Electric Stuff
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #34 on: October 06, 2015, 07:19:14 am »
I'm amazed at how many toolchains don't, by default, omit unused functions. It's a total no-brainer for producing efficient code.
There are occasions where you need to force inclusion of apparently un-called functions, but this should be done explicitly, not by default.

For something as simple as a UART, it will usually be quicker and easier to code it from scratch rather than trying to learn, understand (and if you're unlucky, debug) a HAL library which will inevitably include functionality you don't need.

The only time I've looked at CMSIS, it was on an NXP part, and it contained a ridiculous amount of code (including a big lookup table) to set up peripherals & clocks from compile-time constants, using run-time calculations instead of doing them at compile time as it should have.
And it had a nasty PLL setup bug that made it sometimes hang on startup.
 
An issue that stands in the way of doing things optimally here is that C's preprocessor is rather limited: it can't do some of the things you sometimes need in order to do this properly, like loops.
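
For comparison, C++14 `constexpr` functions do allow loops that the compiler can run at build time. The sketch below is only an illustration: the PLL model (f_out = f_in / div * mul, with mul in 2..16 and div in 1..8) and the names `PllConfig` and `find_pll` are invented and do not describe any real part or library.

```cpp
#include <cstdint>

// Hypothetical PLL model, invented for this sketch.
struct PllConfig {
    std::uint32_t mul;
    std::uint32_t div;
    std::uint32_t freq_hz;
};

// C++14 constexpr function: an ordinary nested loop that searches for the
// multiplier/divider pair closest to the target frequency.
constexpr PllConfig find_pll(std::uint32_t in_hz, std::uint32_t target_hz) {
    PllConfig best{2, 1, in_hz * 2};
    std::uint32_t best_err = best.freq_hz > target_hz ? best.freq_hz - target_hz
                                                      : target_hz - best.freq_hz;
    for (std::uint32_t mul = 2; mul <= 16; ++mul) {
        for (std::uint32_t div = 1; div <= 8; ++div) {
            // Integer division first: good enough for a sketch, loses precision.
            const std::uint32_t f = in_hz / div * mul;
            const std::uint32_t err = f > target_hz ? f - target_hz : target_hz - f;
            if (err < best_err) {
                best.mul = mul;
                best.div = div;
                best.freq_hz = f;
                best_err = err;
            }
        }
    }
    return best;
}

// Evaluated by the compiler: the search itself costs nothing at run time.
constexpr PllConfig kPll = find_pll(8000000, 72000000);
static_assert(kPll.freq_hz == 72000000, "no exact PLL setting for this target");
```

No lookup table and no run-time arithmetic is needed, and an unreachable target frequency becomes a compile error rather than a hang on startup.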

HALs are useful for complex stuff like filesystems, USB, networking etc. but to use them by default for simple functionality like IO setup, UARTS etc. is often going to be a really bad idea. How often does anyone really port code to different processors? And on those occasions what are the chances that the HALs will still be compatible?

And consider who will be writing those HALs at a device manufacturer - probably one step up from the poor sucker/intern who's tasked with writing device-specific header files.

Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #35 on: October 06, 2015, 08:36:08 am »
I don't want to make it sound like a commercial, but this is how you initialize the UART using JumpStart API:

    usart2.SetPins(&porta, 2, 1, &porta, 3, 1);
    usart2.MakeUSART(9600, 8, 1, true);

    printf("\r\nImageCraft JumpStart MicroBox... System running at %dMhz\n", jsapi_clock.GetSysClkFreq() / 1000000);

...And printf works. "SetPins" specifies the pins and the ST alternate function codes. MakeUSART specifies the UART protocol parameters.

I will be adding interrupt-driven ring buffer support in the next week or so.
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #36 on: October 06, 2015, 02:44:13 pm »
I'm amazed at how many toolchains, don't, by default, omit any unused functions. It's a total no-brainer for producing efficient code
There are occasions where you need to force inclusion of apparently un-called functions, but this should be done explicitly, not by default.

For something as simple as a UART, it will usually be quicker and easier to code it form scratch rather than trying to learn, understand (and if you're unlucky, debug) a HAL library which will inevitably include functionality you don't need. 

The only time I've looked at CMSIS, it was on a NXP part, and it contained a ridiculous amount of code (including a big lookup table) to set up peripherals & clocks from compile-time constants using run-time calculations instead of compile-time as it should have been.
And it had a nasty PLL setup bug that made it sometimes hang on startup.
 
An issue that stands in the way of doing things optimally here is that C's preprocessor is rather limited in doing the sort of things you sometimes need to do this sort of thing properly, like loops.

HALs are useful for complex stuff like filesystems, USB, networking etc. but to use them by default for simple functionality like IO setup, UARTS etc. is often going to be a really bad idea. How often does anyone really port code to different processors? And on those occasions what are the chances that the HALs will still be compatible?

And consider who will be writing those HALs at a device manufacturer - probably one step up from the poor sucker/intern who's tasked with writing device-specific header files.

Good points and thank you for staying on topic!
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 12086
  • Country: gb
    • Mike's Electric Stuff
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #37 on: October 06, 2015, 04:14:24 pm »
I don't want to make it sound like a commercial, but this is how you initialize the UART using JumpStart API:

    usart2.SetPins(&porta, 2, 1, &porta, 3, 1);
    usart2.MakeUSART(9600, 8, 1, true);

    printf("\r\nImageCraft JumpStart MicroBox... System running at %dMhz\n", jsapi_clock.GetSysClkFreq() / 1000000);

...And printf works. "SetPins" specifies the pins and the ST alternate function codes. MakeUSART specifies the UART protocol parameters.

I will be adding ring buffer interrupt driven support in the next week or so.
OK now make it run at 4Mbaud, do on-the-fly ID filtering, packet framing and receive double-buffering.
 

Offline Bassman59

  • Super Contributor
  • ***
  • Posts: 1291
  • Country: us
  • Yes, I do this for a living
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #38 on: October 06, 2015, 05:01:59 pm »
Ah, good ol' Mr. Torvalds

"*YOU* are full of bullshit..." ha ha, he holds no punches.

And he's completely correct about Boost, which remains shite.
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #39 on: October 06, 2015, 06:54:04 pm »

OK now make it run at 4Mbaud, do on-the-fly ID filtering, packet framing and receive double-buffering.

Dear Mike, our goal for the design (of most of our tools) is mainly "what do 95% of the users need/want 95% of the time?" I will do some research and see which parts of your suggestions fall in those categories.

Thank you for your comments.
 

Offline ralphd

  • Frequent Contributor
  • **
  • Posts: 442
  • Country: ca
    • Nerd Ralph
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #40 on: October 07, 2015, 04:08:52 am »
I agree with Bill that function overloading is a good thing.
I disagree on templates.  While some people can do some cool things with templates, you need to be a *really* good programmer to do it well given the complexity of templates.
As for compile time constant evaluation, I've said before that lto solves most of those problems.  Since gcc 4.9.2 lto works very well.  Adding -flto to your makefile is a *lot* simpler than writing a fancy constexpr template function...
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #41 on: October 07, 2015, 05:30:25 am »
I agree with Bill that function overloading is a good thing.
I disagree on templates.  While some people can do some cool things with templates, you need to be a *really* good programmer to do it well given the complexity of templates.
As for compile time constant evaluation, I've said before that lto solves most of those problems.  Since gcc 4.9.2 lto works very well.  Adding -flto to your makefile is a *lot* simpler than writing a fancy constexpr template function...

I am leaning toward adding function overloading to our compilers because I also think it's a good thing. Last year we added interspersed declarations and statements, anonymous struct/union, and "C with Classes", and they are really useful; I expect function overloading to be the same.

Templates...hmmm... if you really want a macro processor done right, look at BLISS, it's even better than the venerable m4. The code generators for the Digital GEM compilers for VAX-11, Alpha, MIPS are basically written in BLISS macros that are compiled 3 times. Remember that at that time (1992), the fastest workstation was the 66 MHz HP PA-RISC "Snake" and then the DEC Alpha EV4 came out at a mind boggling 200 MHz, and the compilers were a major factor in how they achieved the overall performance. (I happened to have worked on both the DEC GEM and HP PA-RISC compilers, not extensively as I came in late to both projects, but it's very interesting to see the differences in how the two teams interact and how they approach the problems).

Anyway, so yes, templates and boost give me a headache :-)

Lastly, link time optimization is a good thing. Good to see that got put into a production compiler.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #42 on: October 07, 2015, 05:53:50 am »
Quote
if you really want a macro processor done right, look at BLISS
Many assemblers have wonderful macro processing capabilities.  The gnu assembler isn't one of them :-(

I'm not sure that the sort of template programming that would be needed to clean up the HAL-like libraries would be that complex.
And presumably, the person WRITING the library would (or could, anyway) be pretty good.  (not that the current libraries give me a lot of confidence that that is currently true.  Sigh.)  It doesn't have to be fully optimal; just "much better" than the current code.   That wouldn't be very hard...

ONE of the things you notice when you delve into vendor libraries is that they're carefully written not to have a lot of compiler dependencies.  It's *necessary* that the same libraries work for Keil, IAR, GCC, and others.  Letting the source get sloppy and assuming that gcc link-time-optimization will fix everything up is NOT acceptable.  (Of course, this also ends up being a problem with using C++; you don't want to piss off compiler "partner" companies who don't have C++ compilers yet, or, for that matter, customers who don't want to pay for or deal with the complexities of C++ just to write C programs.)

And I don't think lto can fix some of the poor APIs that vendors are using.  I mean, ASF at least is full of:
Code: [Select]
get_thing_defaults(&init_struct);
init_struct.somethingINeedToChange = WEIRDCONSTANT;
init_struct.somethingElse = OTHERCONSTANT;
thing_init(&init_struct);


 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #43 on: October 07, 2015, 06:23:49 am »
...
And I don't think lto can fix some of the poor APIs that vendors are using.  I mean, ASF at least is full of:
Code: [Select]
get_thing_defaults(&init_struct);
init_struct.somethingINeedToChange = WEIRDCONSTANT;
init_struct.somethingElse = OTHERCONSTANT;
thing_init(&init_struct);

ASF does that TOO? ST's lib is full of this.... *let me channel Linus Torvalds here* ah, inefficiency. WTF are these people thinking?! How could things like these be considered good?
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 6616
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #44 on: October 07, 2015, 07:52:54 am »
ASF does that TOO? ST's lib is full of this.... *let me channel Linus Torvalds here* ah, inefficiency. WTF are these people thinking?! How could things like these be considered good?
Written by those whose only experience with "efficiency" is in academic theory and/or desktop/server environments where "buy more hardware" is the prevailing line of thought and being apathetic towards optimisation is considered a virtue. They're spoiled brats. :P
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3123
  • Country: us
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #45 on: October 07, 2015, 08:45:34 am »
Quote
init_struct.somethingINeedToChange = WEIRDCONSTANT;
Indeed.  I should also note that the size of the init_struct is typically significantly larger than the size of the hardware peripheral registers actually involved in configuring "thing."  (e.g. sizeof(SERCOM0) == 52, including ~20 bytes of "reserved" space, while sizeof(struct usart_config) == 64.  Sigh.)  This is the sort of thing that gets one particularly enthusiastic about calculating things at compile-time.
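
One way to picture "calculating things at compile-time" here: fold the settings into the final register value in a `constexpr` function, so that with constant settings the struct never has to exist in RAM. This is only a sketch; the register layout, `UartSettings`, and `control_word` are all invented for illustration and are not ASF or ST definitions.

```cpp
#include <cstdint>

// Hypothetical control-register layout, invented for this sketch:
//   bit 0      enable
//   bits 1-2   parity (0 = none, 1 = even, 2 = odd)
//   bit 3      two stop bits
enum class Parity : std::uint32_t { None = 0, Even = 1, Odd = 2 };

struct UartSettings {
    Parity parity;
    bool two_stop_bits;
};

// Folds the whole settings struct into one register value.
constexpr std::uint32_t control_word(UartSettings s) {
    return 1u
         | (static_cast<std::uint32_t>(s.parity) << 1)
         | (s.two_stop_bits ? (1u << 3) : 0u);
}

// With constant settings, initialisation reduces to storing this one
// precomputed word to the peripheral.
constexpr std::uint32_t kControl = control_word({Parity::Even, true});
static_assert(kControl == 0xB, "unexpected encoding");
```

The call site still reads like a named configuration, but what the compiler has to emit is a single constant.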
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 3393
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #46 on: October 07, 2015, 08:56:34 am »
I actually like it. You just create an array of those structs in flash to configure the pins.
Easy to read, easy to switch, easy to diff.
Yes, it takes a bit of flash. But it would take an equal (if not larger) amount of flash if you wrote all the register configuration lines separately.
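
A minimal sketch of that table-driven approach, with everything hypothetical: `PinConfig`, `PinMode`, and `apply_pins` are invented names, and `apply_pins` only counts modes where real code would write GPIO registers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pin description; fields and encodings are invented for illustration.
enum class PinMode : std::uint8_t { Input, Output, Alternate, Analog };

struct PinConfig {
    char port;          // 'A', 'B', ...
    std::uint8_t pin;   // 0..15
    PinMode mode;
};

// One table, const so the linker can place it in flash rather than RAM.
static const PinConfig kPinTable[] = {
    {'A', 2, PinMode::Alternate},   // serial TX
    {'A', 3, PinMode::Alternate},   // serial RX
    {'B', 0, PinMode::Output},      // LED
    {'C', 13, PinMode::Analog},     // unused pin, parked (one common choice)
};

// Stand-in for the code that would write the real GPIO registers.
// Here it only counts how many pins were put into each mode.
struct ApplyResult { unsigned per_mode[4]; };

inline ApplyResult apply_pins(const PinConfig *table, std::size_t n) {
    ApplyResult r = {};
    for (std::size_t i = 0; i < n; ++i) {
        ++r.per_mode[static_cast<std::size_t>(table[i].mode)];
    }
    return r;
}
```

Changing a pin assignment is then a one-line diff in the table, and the loop that applies it is shared by every pin.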
 

Offline bingo600

  • Super Contributor
  • ***
  • Posts: 1423
  • Country: dk
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #47 on: October 07, 2015, 08:32:29 pm »
I gave up on ST's libs, as I had to write more lines there than I have to in VHDL.
And a v3 example was absolutely NOT compatible with a v4 example.

So I switched to libopencm3, and i'm quite happy with it.
I have only just begun, but have a USB-CDC example up & running for a F103, only taking up 6k.

You need to dig around the examples, the API Doc (Doxygen ... sigh) , and examples from the web.
But it's easier to cope with , and generates efficient code.

/Bingo
 

Offline jnz

  • Frequent Contributor
  • **
  • Posts: 465
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #48 on: October 07, 2015, 08:52:04 pm »
I gave up on ST's libs, as i had to write more lines there, than i have to in VHDL
And a v3 example was absolutely NOT compatible with a v4 example.

So I switched to libopencm3, and i'm quite happy with it.
I have only just begun, but have a USB-CDC example up & running for a F103, only taking up 6k.

You need to dig around the examples, the API Doc (Doxygen ... sigh) , and examples from the web.
But it's easier to cope with , and generates efficient code.

/Bingo

I was under the impression that project was entirely dead. I'm extremely reluctant to base a production design off of dead open source.
 

Offline bingo600

  • Super Contributor
  • ***
  • Posts: 1423
  • Country: dk
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #49 on: October 07, 2015, 08:58:12 pm »
I gave up on ST's libs, as i had to write more lines there, than i have to in VHDL
And a v3 example was absolutely NOT compatible with a v4 example.

So I switched to libopencm3, and i'm quite happy with it.
I have only just begun, but have a USB-CDC example up & running for a F103, only taking up 6k.

You need to dig around the examples, the API Doc (Doxygen ... sigh) , and examples from the web.
But it's easier to cope with , and generates efficient code.

/Bingo

I was under the impression that project was entirely dead. I'm extremely reluctant to base a production design off of dead open source.

Well there are recent commits
https://github.com/libopencm3/libopencm3

But I primarily see it as a set of "helper routines" for USB etc.; the rest I'll probably do on the bare metal.

/Bingo
 

Offline ralphd

  • Frequent Contributor
  • **
  • Posts: 442
  • Country: ca
    • Nerd Ralph
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #50 on: October 07, 2015, 09:43:45 pm »
I actually like it. You just create a flash array of those structs to configure the pins.
Easy to read, easy to switch, easy to diff.
Yes it takes a bit flash. But it would take an equal (if not larger) amount of flash if you write all the register configure lines separate.
But usually only a few of the registers need to be written.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 3393
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: STM32's HAL... Anything I can do to fix this outrageous bloat?! (pic)
« Reply #51 on: October 08, 2015, 05:48:41 am »
That depends on whether you want to set unused pins into a more energy-efficient state.
 

