Author Topic: C struct bitfields - size versus speed  (Read 12460 times)

0 Members and 1 Guest are viewing this topic.

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1650
  • Country: gb
C struct bitfields - size versus speed
« on: April 30, 2018, 10:07:43 pm »
I never really thought about this until today, but when using bitfields there must be some kind of trade-off between memory size and execution speed, right?

Say you have an struct like the following:

Code: [Select]
typedef struct {
    uint8_t foo;
    uint16_t bar;
    uint8_t baz : 3;
    bool flag_a : 1;
    bool flag_b : 1;
    bool flag_c : 1;
    bool flag_d : 1;
    bool flag_e : 1;
} my_struct_t;

Now, by setting baz to 3 bits (assuming it only needs to represent values 0-7) and using single-bit fields for our 5 boolean flags, we've gone from a total size of 9 bytes to only 4 bytes. Nice, some memory saved. However, when you're writing your code to access these bitfields, what kind of machine code is the compiler actually generating?

I assume there must be some penalty to reads and writes to these fields, in the form of extra instructions necessary to 'resize' the data. Without knowing what actually goes on, I would assume the compiler has to generate masking and shifting instructions. For instance, in the case of assigning some other uint8_t variable to baz, I would assume you would need to mask off the relevant three bits, and depending on the alignment within the struct, shift those 3 bits into the correct position.

Am I right? Or, because the C standards don't actually define how struct bitfields are physically packed, is this all platform- and/or compiler-dependant?
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 28736
  • Country: nl
    • NCT Developments
Re: C struct bitfields - size versus speed
« Reply #1 on: April 30, 2018, 10:13:30 pm »
The compiler will create bit shifts & masks to get and store the bits from the struct so yes it will be a lot slower. I have used this once or twice in the past decades to safe memory space.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3696
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #2 on: April 30, 2018, 10:19:22 pm »
However, when you're writing your code to access these bitfields, what kind of machine code is the compiler actually generating?

I assume there must be some penalty to reads and writes to these fields, in the form of extra instructions necessary to 'resize' the data. Without knowing what actually goes on, I would assume the compiler has to generate masking and shifting instructions. For instance, in the case of assigning some other uint8_t variable to baz, I would assume you would need to mask off the relevant three bits, and depending on the alignment within the struct, shift those 3 bits into the correct position.
If the assigned value is a constant, the compiler can use a pre-shifted literal. The only extra instructions are a mask to preserve the remaining bits. If you assign multiple fields, it can coalesce multiple writes into one, and that saves more time than the mask instructions take. Keeping the struct size small also saves time when initializing it and passing it in function arguments.

Quote
Am I right? Or, because the C standards don't actually define how struct bitfields are physically packed, is this all platform- and/or compiler-dependant?
Of course struct packing is compiler dependent; it goes far beyond just bitfields.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9987
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #3 on: April 30, 2018, 10:22:36 pm »
Why not compile some code accessing the various field and look at the assembly file?  Use the -S option on the gcc command line.  Assuming gcc, of course.

Change all of the variables to unsigned char and see how that works out.

Definitely put some bit extract/test expressions inside an 'if' statement like if (flag_a) { ... }

Yes, it's going to be compiler dependent and maybe even platform dependent.

It will be worth the time for the education.

Unless you are working on a small micro where flash is large but ram is small, why bother?

Oh, and be aware of endianness, in some cases the bits will be layed out 'backwards' so when you write masking operations, you have a 50-50 chance of having everything wrong.
I played with this a couple of weeks ago and I had to do a #ifdef to test the endianness.

 

Offline C

  • Super Contributor
  • ***
  • Posts: 1346
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #4 on: April 30, 2018, 10:35:30 pm »

Really in to how smart the compiler is.
Some things in a language force a compiler to be smarter.

A dumb compiler might treat each bit as a byte.

A little smarter and compiler might do shifts and masks.

A really smart compiler would look at where a range of bits exist.
a switch statement then is not as a 0-7 value
but as the actual storage value range.
You program switch statement as 0-7 and compiler translates it to storage position values for same result with less instructions.

When compiler gets this smart, it can skip a lot of shifts & rotates which gives faster smaller code result.

C
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1650
  • Country: gb
Re: C struct bitfields - size versus speed
« Reply #5 on: April 30, 2018, 10:49:24 pm »
The compiler will create bit shifts & masks to get and store the bits from the struct so yes it will be a lot slower.

When you say "a lot slower", are you exaggerating for effect, or do you mean that?

Why not compile some code accessing the various field and look at the assembly file?  Use the -S option on the gcc command line.  Assuming gcc, of course.

I'm not really acquainted with assembly language. Much of it is just gibberish to me. :P All I'd be able to conclude with any certainty is that the code is shorter or longer. I've previously found it very hard to look at the -S output from GCC, as the annotation of the assembly with the original source code is all over the place and lines frequently don't seem to bear any relation to the section of assembly code they're placed adjacent to. :(

Unless you are working on a small micro where flash is large but ram is small, why bother?

The situation that has got me thinking about all this is that I have recently been writing some code for exactly that kind of small micro with little RAM. :) The code takes incoming serial data and throws it into a FIFO queue to be processed - some classification, filtering, etc. I am using bitfields for my queue structure in order to save a lot of memory - the difference between taking half the available RAM and only a quarter. But, I need this code to be super speedy, as my program's main loop will be doing a lot of other stuff, so I need the queue-processing code to take as little time as possible.

In my mind, at the moment, the saving on memory outweighs any potential execution speed penalty. But, if it later turns out I'm marginal on being able to adequately 'keep up' with the incoming data stream because my main loop is doing too much other stuff, I'm worried I may have shot myself in the foot by giving myself a performance penalty by using bitfields.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12609
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #6 on: April 30, 2018, 11:05:12 pm »
If you write a test like this:

Code: [Select]
if (my_struct.flag_b) { /* do something */ }
Then I don't see why the generated code should be slow at all. It's a simple test of a bit using a bit mask.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9987
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #7 on: April 30, 2018, 11:30:07 pm »
Here's a little GCC example of endianness and bitfields.

Code: [Select]
#include <stdio.h>

union my_register{
#if __BIG_ENDIAN
struct
{
unsigned short Right : 1;
unsigned short Mid   : 2;
unsigned short Left  : 5;
};
#else
struct
{
unsigned short Left  : 5;
unsigned short Mid   : 2;
unsigned short Right : 1;
};
#endif
unsigned char value;
};

int main(void)
{
union my_register reg;

reg.value = (unsigned char) 0b10001101; // 0b10001_10_1

printf("%#2x %#1x %#1x\n",reg.Left, reg.Mid, reg.Right);
return 0;
}


As long as you access the bitfields with a fully qualified name rather than masking against a literal like 0b10101101 you don't have to consider endianness or anything else.  The compiler will make it work.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 5068
  • Country: nz
Re: C struct bitfields - size versus speed
« Reply #8 on: May 01, 2018, 04:35:47 am »
I never really thought about this until today, but when using bitfields there must be some kind of trade-off between memory size and execution speed, right?

Say you have an struct like the following:

Code: [Select]
typedef struct {
    uint8_t foo;
    uint16_t bar;
    uint8_t baz : 3;
    bool flag_a : 1;
    bool flag_b : 1;
    bool flag_c : 1;
    bool flag_d : 1;
    bool flag_e : 1;
} my_struct_t;

Now, by setting baz to 3 bits (assuming it only needs to represent values 0-7) and using single-bit fields for our 5 boolean flags, we've gone from a total size of 9 bytes to only 4 bytes. Nice, some memory saved. However, when you're writing your code to access these bitfields, what kind of machine code is the compiler actually generating?

I assume there must be some penalty to reads and writes to these fields, in the form of extra instructions necessary to 'resize' the data. Without knowing what actually goes on, I would assume the compiler has to generate masking and shifting instructions. For instance, in the case of assigning some other uint8_t variable to baz, I would assume you would need to mask off the relevant three bits, and depending on the alignment within the struct, shift those 3 bits into the correct position.

Am I right? Or, because the C standards don't actually define how struct bitfields are physically packed, is this all platform- and/or compiler-dependant?

Exactly what instructions the compiler has to generate depend on what instructions the CPU type has available :-)

At one extreme, things such as the 68020 could do that in a single instruction: BFINS Dn, <ea>, [offset:width].  The VAX might well have been able to as well, though I can't recall. It's not necessarily FAST though .. in the case of the 68020 BINFS from one register to another takes 10 clock cycles if the instruction is in the cache and it can't be pipelined with other instructions (the most common case).

Simple instructions such as add, sub, and, or, xor take 2 clock cycles. Shifts by static amounts take 4 clock cycles. So you can do several simple operations in the time taken by one BFINS.

At the other extreme, you might not have any support for bit fields at all, in which case you have to do everything with shifts or rotates and masking.  If you at least have rotate (the RISC-V base instruction set doesn't) then you can do this:

ROR t1, rs1, #off
ASL t2, rs3, #32-size
LSR t1, t1, #size
OR rd, t1, t2
ROL rd, rd, #size+off

That will take five clock cycles on most RISC CPUs, or four if the first two instruction can be executed together.

In between, you can have instructions that don't actually do the bitfield insert directly, but just make it easier. For example on the Motorola M88000:

CLR t1, rs1, size<off>   //  clears the destination bitfield
MAK t2, rs2, size<off>  // shifts the inserted value to the right place and clears all other bits
OR rd, t1, t2

That can be done in two clock cycles because the first two instructions can be executed at the same time.

If you need to read the stuff from RAM and write it back afterwards then you'd probably never notice the speed difference between any of these!
« Last Edit: May 01, 2018, 04:37:21 am by brucehoult »
 

Offline bson

  • Supporter
  • ****
  • Posts: 2562
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #9 on: May 01, 2018, 04:46:36 am »
I'd be more concerned with the unaligned uint16_t; unless you explicitly flag the structure as packed you may find the compiler inserts a pad byte.  Or more bizarre failures; for example Cortex-M0/M0+ will hard fault on unaligned accesses while M3/M4/M7 just handle them transparently.  You may find your code works fine on an M4, then you recompile it for M0 and without an explicit packed attribute the compiler will feel free to rearrange or pad the layout of your structure to suit the target.  So recompiling your code for M0, as the structure describes a protocol header layout or some other externally determined format, it mysteriously doesn't work at all.  (As a side note, if you write code for M4/M7 you intend to ever run on M0 it's a good idea to compile it for M0 as the target and set the STKALIGN and UNALIGN_TRP bits in CCR sometime during startup to make sure it still works and doesn't hard fault on an unaligned access.)
 

Offline janekm

  • Supporter
  • ****
  • Posts: 515
  • Country: gb
Re: C struct bitfields - size versus speed
« Reply #10 on: May 01, 2018, 05:15:57 am »
The compiler may not respect your requests for the physical layout of these fields unless you use one of the compiler-specific options to make it a "packed" struct. This can catch you out if these data structures are going to be stored or communicated to somewhere else (say if this was a network packet).
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 4263
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: C struct bitfields - size versus speed
« Reply #11 on: May 01, 2018, 05:29:19 am »

Really in to how smart the compiler is.
Some things in a language force a compiler to be smarter.

A dumb compiler might treat each bit as a byte.

A little smarter and compiler might do shifts and masks.

A really smart compiler would look at where a range of bits exist.
A really smart compiler would use bitbanding. But this is not available in all parts and cores, so compiler prefer to stay portable.
If you need to write&read a lot of random bits, you'd definitely want to use bitbanding.
 

Offline obiwanjacobi

  • Super Contributor
  • ***
  • Posts: 1013
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: C struct bitfields - size versus speed
« Reply #12 on: May 01, 2018, 07:23:09 am »
Some MCU's may support dedicated bit manipulation instructions. If you do not prevent the compiler from using them the perf will be good.

Here's my take on a bit array (C++) https://github.com/obiwanjacobi/atl/blob/master/Source/Code/ArduinoTemplateLibrary/BitArray.h
[2c]
Arduino Template Library | Zalt Z80 Computer
Wrong code should not compile!
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1650
  • Country: gb
Re: C struct bitfields - size versus speed
« Reply #13 on: May 01, 2018, 03:30:41 pm »
I'd be more concerned with the unaligned uint16_t; unless you explicitly flag the structure as packed you may find the compiler inserts a pad byte.

Why would that be? The first two fields, foo and bar are whole bytes (that is, byte-aligned), so what reason would a compiler have to insert a padding byte? Would it be something to do with platforms that have a large word size? Without telling the compiler to pack closely, do structure fields get padded out to native word size (e.g. 16/32/64 bits) by some compilers on some platforms? What are the reasons for such behaviour?

The compiler may not respect your requests for the physical layout of these fields unless you use one of the compiler-specific options to make it a "packed" struct. This can catch you out if these data structures are going to be stored or communicated to somewhere else (say if this was a network packet).

Is this worth doing just as a matter of routine, even if close packing happens by default? For instance, with AVR-GCC packing of the structure as-defined appears to be the default behaviour - unless my toolchain happens to be specifying some compiler flag by default to do this.
 

Offline glarsson

  • Frequent Contributor
  • **
  • Posts: 814
  • Country: se
Re: C struct bitfields - size versus speed
« Reply #14 on: May 01, 2018, 04:00:13 pm »
Why would that be? The first two fields, foo and bar are whole bytes (that is, byte-aligned), so what reason would a compiler have to insert a padding byte? Would it be something to do with platforms that have a large word size? Without telling the compiler to pack closely, do structure fields get padded out to native word size (e.g. 16/32/64 bits) by some compilers on some platforms? What are the reasons for such behaviour?
Most processor architectures likes padding to the word sizes. For misaligned variables MIPS, PowerPC, X86, etc will perform two memory accesses and and/shift/or to get the wanted content. Other processors like SPARC will not do this and just fail (SIGBUS in UNIX systems) as it will not be able to complete the instruction within one clock cycle.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 28736
  • Country: nl
    • NCT Developments
Re: C struct bitfields - size versus speed
« Reply #15 on: May 01, 2018, 04:05:03 pm »
The compiler will create bit shifts & masks to get and store the bits from the struct so yes it will be a lot slower.
When you say "a lot slower", are you exaggerating for effect, or do you mean that?
It is slower because reading a value consisting of whole bytes just takes one instruction (on a modern 32 bit CPU). Taking a value apart into bits means having extra machine code for shifting & masking the values.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline FlyingHacker

  • Frequent Contributor
  • **
  • Posts: 807
  • Country: us
  • You're Doing it Wrong
Re: C struct bitfields - size versus speed
« Reply #16 on: May 01, 2018, 04:33:22 pm »
Often the best bet is to put some test code into a big loop and time the code different ways to see what the speed differences are with your specific compiler and hardware.

You also have to evaluate things like how much is the code used? Is it in a loop executed millions of times, or is it used once or twice? Weigh that against memory usage, ease of code reading, and ease of programming...
--73
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3696
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #17 on: May 01, 2018, 05:07:03 pm »
At one extreme, things such as the 68020 could do that in a single instruction: BFINS Dn, <ea>, [offset:width].  The VAX might well have been able to as well, though I can't recall.

The VAX has INSV src, offset, width, dest
But with little-endian addressing, so the bitfield's offset points to its LSB. With the MC68020, the offset of a bitfield points to its MSB.
In both architectures the offset can be up to 268435455 bytes away from the dest. This allows you to do bitfield ops on structures directly given the struct's address, instead of needing to calculate the address of the word containing the bitfield.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16289
  • Country: fr
Re: C struct bitfields - size versus speed
« Reply #18 on: May 01, 2018, 05:21:43 pm »
Bit fields are not very portable and are essentially sugar coating.

They may be faster on a given target if it has specific bit-manipulation instructions and the compiler generates code for them, but even so, the benefit would depend on the number of bits in the same word you are working on. If you have to manipulate several bits in the same word at the same time, it's usually much faster to use conventional integer bit operations such as &, |, ~, ^. Of course, a good compiler may optimize your code in that way with bit fields as well. So the answer is, you have to check the assembly output of your compiler in different situations to see for yourself.

Apart from optimization reasons, I tend to avoid bit fields because they are usually not portable, especially if you intend to have each specific bit field to have a specific position in the encapsulating data word, so compiling on a different target may get you completely different results.

 

Offline Howardlong

  • Super Contributor
  • ***
  • Posts: 5429
  • Country: gb
Re: C struct bitfields - size versus speed
« Reply #19 on: May 01, 2018, 06:37:50 pm »

Really in to how smart the compiler is.
Some things in a language force a compiler to be smarter.

A dumb compiler might treat each bit as a byte.

A little smarter and compiler might do shifts and masks.

A really smart compiler would look at where a range of bits exist.
A really smart compiler would use bitbanding. But this is not available in all parts and cores, so compiler prefer to stay portable.
If you need to write&read a lot of random bits, you'd definitely want to use bitbanding.

As I understand it, bit banding such as on the Cortex M3 and M4 aliases each bit in a word to multiple address spaces, allowing single bit atomic test/set/clear operation, speeding up single bit access on cores. This is no good for manipulating multiple bits concurrently.

If you have fields of more than one bit, then some cores offer bitfield instructions such as the Cortex M3 and M4, but these are register only instructions, so no good if you’re looking for atomic memory RMW, but they can be computationally efficient for the OP’s requirement.

The PIC16/PIC24/dsPIC offer efficient direct atomic single bit manipulation across the entire RAM address space, but it’s no benefit for multi-bit bit fields. However the PIC32 manages RMW differently by providing aliases for memory mapped peripheral registers so you can set, clear and invert as many bits as you like atomically within a machine word: this is only available on peripheral registers though, so no good for your own RAM based structures.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 28736
  • Country: nl
    • NCT Developments
Re: C struct bitfields - size versus speed
« Reply #20 on: May 01, 2018, 06:52:52 pm »
Bit fields are not very portable and are essentially sugar coating.

They may be faster on a given target if it has specific bit-manipulation instructions and the compiler generates code for them, but even so, the benefit would depend on the number of bits in the same word you are working on. If you have to manipulate several bits in the same word at the same time, it's usually much faster to use conventional integer bit operations such as &, |, ~, ^. Of course, a good compiler may optimize your code in that way with bit fields as well. So the answer is, you have to check the assembly output of your compiler in different situations to see for yourself.
I disagree. Bit fields are a good way to safe space without making your code obfustigated. Let the C compiler deal with the nasty stuff!
Quote
Apart from optimization reasons, I tend to avoid bit fields because they are usually not portable, especially if you intend to have each specific bit field to have a specific position in the encapsulating data word, so compiling on a different target may get you completely different results.
Sharing structs between platforms/compilers (for example to communicate) isn't a good thing to do anyway because of alignment issues and packing strategies. I always tell people not to map structs and plain C types onto byte buffers because it will go wrong at some point.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline andyturk

  • Frequent Contributor
  • **
  • Posts: 895
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #21 on: May 02, 2018, 12:57:11 am »
Sharing structs between platforms/compilers (for example to communicate) isn't a good thing to do anyway because of alignment issues and packing strategies. I always tell people not to map structs and plain C types onto byte buffers because it will go wrong at some point.
:-+
 

Online NorthGuy

  • Super Contributor
  • ***
  • Posts: 3325
  • Country: ca
Re: C struct bitfields - size versus speed
« Reply #22 on: May 02, 2018, 01:38:11 am »
Sharing structs between platforms/compilers (for example to communicate) isn't a good thing to do anyway because of alignment issues and packing strategies. I always tell people not to map structs and plain C types onto byte buffers because it will go wrong at some point.

Look at the file formats (ELF, ZIP etc.), or communication protocols (such as TCP/IP). They all work with records which are essentially mapped C structures. They work quite well across platforms without causing issues. Of course, when these structures were created they were aligned by hand - there's no need for padding (and if it is then the padding is explicitly added). Byte/bit order is also fixed by design. The designers were clever about this. But so can you.

Besides, if you don't waste your time packing and unpacking structures, you will be more productive - you'll write more :)

 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16289
  • Country: fr
Re: C struct bitfields - size versus speed
« Reply #23 on: May 02, 2018, 08:30:39 am »
Sharing structs between platforms/compilers (for example to communicate) isn't a good thing to do anyway because of alignment issues and packing strategies. I always tell people not to map structs and plain C types onto byte buffers because it will go wrong at some point.

Look at the file formats (ELF, ZIP etc.), or communication protocols (such as TCP/IP). They all work with records which are essentially mapped C structures. They work quite well across platforms without causing issues. Of course, when these structures were created they were aligned by hand - there's no need for padding (and if it is then the padding is explicitly added). Byte/bit order is also fixed by design. The designers were clever about this. But so can you.

Besides, if you don't waste your time packing and unpacking structures, you will be more productive - you'll write more :)

Absolutely. Forcing the alignment of struct members is usually straightforward and reasonably portable. All C compilers I have come across for targets from 8-bitters to 64-bit CPUs had means of defining alignment in non-ambiguous ways.
On some platforms, you still have to worry about unaligned access - they can be either slower (on most targets, which is why the native default alignment usually matches the target's data bus width) or lead to an exception (such as on the 16-bit PIC micros). That can be worked around of course.

 

Offline andyturk

  • Frequent Contributor
  • **
  • Posts: 895
  • Country: us
Re: C struct bitfields - size versus speed
« Reply #24 on: May 02, 2018, 08:45:11 am »
Besides, if you don't waste your time packing and unpacking structures, you will be more productive - you'll write more :)
You might have endian conversions in a portable format, so a direct mapped structure could still have packing/unpacking.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf