Author Topic: Using UNION to convert data types  (Read 3627 times)


Offline JustMeHere (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 826
  • Country: us
Using UNION to convert data types
« on: October 12, 2018, 04:12:17 am »
I've seen a couple of posts lately where people are asking: How do I read 16, 24, or 32 bits from SPI (any serial bus, really) with an 8-bit micro?

A trick I use is a union.

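#include <stdint.h>

union data
{
    uint8_t       busData[4];   /* raw bytes as they arrive from the bus */
    uint32_t      combined;     /* the same four bytes viewed as one 32-bit value */
};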

Now you can fill the union byte by byte in a for loop and read it back as a 32-bit value, without having to waste effort on bit shifting.
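A minimal sketch of that loop-fill idea, for anyone who wants to try it on a PC first. The function spi_read_byte() is a hypothetical stand-in for a real SPI driver call (here it just replays a fixed array), and read_u32_via_union() is an illustrative name, not something from an existing library. The value you get back depends on the byte order of the CPU you run it on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

union data {
    uint8_t  busData[4];
    uint32_t combined;
};

/* Hypothetical stand-in for a real SPI driver: replays four fixed bytes. */
static const uint8_t fake_bus[4] = {0x78, 0x56, 0x34, 0x12};
static size_t fake_pos = 0;

static uint8_t spi_read_byte(void)
{
    assert(fake_pos < sizeof fake_bus);  /* the fake bus only holds 4 bytes */
    return fake_bus[fake_pos++];
}

/* Fill the union one byte at a time, then read all four bytes as one value. */
static uint32_t read_u32_via_union(void)
{
    union data d;
    for (size_t i = 0; i < sizeof d.busData; i++) {
        d.busData[i] = spi_read_byte();
    }
    return d.combined;  /* byte order of the result follows the host CPU */
}
```

On a little-endian CPU the first byte received ends up as the least significant byte, so the bytes above read back as 0x12345678; on a big-endian CPU the same bytes read back as 0x78563412.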

I haven't tried it yet, but I would expect that if you combined a struct with a union you could use the same technique to read structures from a serial data bus too.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Using UNION to convert data types
« Reply #1 on: October 12, 2018, 05:30:50 am »
Byte ordering will be a problem for some CPUs, so code will have portability issues.

On a lot of CPUs byte shifts are not very expensive anyhow, or there is a 'byte swap word' instruction.
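For comparison, a sketch of the shift-based approach, which gives the same result regardless of the CPU's byte order. The function names be32_from_bytes() and le32_from_bytes() are made up for this example.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes sent most significant byte first. */
static inline uint32_t be32_from_bytes(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] <<  8) |
            (uint32_t)b[3];
}

/* Assemble a 32-bit value from four bytes sent least significant byte first. */
static inline uint32_t le32_from_bytes(const uint8_t b[4])
{
    return  (uint32_t)b[0]        |
           ((uint32_t)b[1] <<  8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

The casts to uint32_t before shifting matter on 8-bit and 16-bit targets, where int is only 16 bits wide.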
 

Offline rjp

  • Regular Contributor
  • *
  • Posts: 124
  • Country: au
Re: Using UNION to convert data types
« Reply #2 on: October 12, 2018, 05:36:13 am »
Yeah, watch byte ordering and potentially also the packing, e.g. on 32-bit ARM the compiler may pad structure members out to aligned offsets even when you use 1- or 2-byte types.
 
 

Offline dmills

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: Using UNION to convert data types
« Reply #3 on: October 12, 2018, 12:27:19 pm »
That is called 'type punning' and violates strict aliasing, for all that gcc explicitly allows it providing it is done via a union.

Do be very careful of bitfields in this sort of thing, as they are completely implementation defined.

Regards, Dan.
 

Offline JustMeHere (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 826
  • Country: us
Re: Using UNION to convert data types
« Reply #4 on: October 24, 2018, 05:21:23 am »
Quote from: dmills on October 12, 2018, 12:27:19 pm
That is called 'type punning' and violates strict aliasing, for all that gcc explicitly allows it providing it is done via a union.

Do be very careful of bitfields in this sort of thing, as they are completely implementation defined.

Regards, Dan.

Totally understand and agree, but in micro programming I am never going to put portability above actual bit slapping.
 

Offline dmills

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: Using UNION to convert data types
« Reply #5 on: October 24, 2018, 10:47:12 am »
On GCC-derived compilers you might find that declaring such structs "__attribute__((__packed__))" is helpful, as it specifies that no padding is to be used within the memory layout.

All very implementation defined, but working down at the metal can be like that sometimes.
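A small sketch of what the packed attribute changes, assuming GCC or Clang. The frame layout (1-byte id, 4-byte value, 2-byte crc) and all the names here are invented for illustration; they do not come from any real protocol.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 7-byte bus frame, declared without any packing request.
   The compiler is free to insert padding after id and after crc. */
struct frame_padded {
    uint8_t  id;
    uint32_t value;
    uint16_t crc;
};

/* Same members, but with padding disabled (GCC/Clang extension). */
struct __attribute__((__packed__)) frame_packed {
    uint8_t  id;
    uint32_t value;
    uint16_t crc;
};

/* Overlay the packed struct on a raw byte buffer of the same size. */
union frame_buf {
    uint8_t             raw[sizeof(struct frame_packed)];
    struct frame_packed fields;
};

/* Copy raw bus bytes into the union and read back a single-byte field.
   Multi-byte fields would additionally depend on the host byte order. */
static uint8_t frame_id_from_raw(const uint8_t bytes[sizeof(struct frame_packed)])
{
    union frame_buf buf;
    memcpy(buf.raw, bytes, sizeof buf.raw);
    return buf.fields.id;
}
```

With the attribute, sizeof(struct frame_packed) is 7 and value sits at offset 1. Without it, the size and offsets are up to the compiler and target ABI, so the struct may no longer line up with the bytes on the wire.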

Regards, Dan.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7192
  • Country: fi
Re: Using UNION to convert data types
« Reply #6 on: October 25, 2018, 07:10:47 am »
Type punning via a union is actually supported by the C standard since C99, so it's not just a GCC feature.

However, it is not allowed in standard C++ (reading a union member other than the one last written is undefined behaviour there), which means that you should not try to use it in the Arduino environment, because it uses C++ (GNU g++).
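If the code has to build as both C and C++, one commonly used alternative to the union (a different technique, not the one discussed above) is to copy the bytes with memcpy(). A sketch, with the function name u32_from_native_bytes() made up for the example:

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret four bytes as a 32-bit value in the CPU's own byte order.
   memcpy() between unrelated object types is well defined in C and C++. */
static inline uint32_t u32_from_native_bytes(const uint8_t src[4])
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Like the union, this still depends on the host byte order; it only sidesteps the aliasing question. Optimizing compilers commonly turn a fixed-size memcpy() like this into a plain load, but that is up to the compiler and settings.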

In both C++ and C (since C99), functions marked static inline are as fast as macros. They not only provide type safety (which macros do not; a macro accepts anything as a parameter), they also help the compiler generate better code when you enable optimizations.
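To illustrate the difference, here is the same 16-bit byte swap written both ways. SWAP16_MACRO and swap16 are names invented for this sketch.

```c
#include <stdint.h>

/* Macro version: takes any integer expression without complaint,
   and evaluates its argument twice. */
#define SWAP16_MACRO(x) \
    ((uint16_t)((((x) & 0xFFu) << 8) | (((x) >> 8) & 0xFFu)))

/* static inline version: the parameter has a declared type, so the
   compiler can diagnose a wrong argument, and x is evaluated once. */
static inline uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}
```

Passing something like a pointer to swap16() gets a diagnostic from the compiler, and an argument with side effects (for example a call that reads the next byte from a bus) is only evaluated once, which is not true of the macro.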

I build all my code using GCC -Wall -O2 flags at minimum. It enables all typical warnings (some of them not so useful; -Wunused-variable being one), and enables optimization. (I've compiled a lot of code using similar settings, as I used to build my own "distro" from scratch; see linuxfromscratch.org. I've also maintained computational clusters, where the correctness of the simulations is important, and compiled and optimized some simulators for them. And writing highly parallel and distributed simulators for computational materials physics is what I'd like to do as a career. So I do claim to have a lot of experience doing this, and that's what I recommend.)

Because of the various quirks in the details, rather than storing a datagram (or file header) in a structure, I do recommend using buffers, and static inline accessor functions. For 32-bit architectures in C, for example
Code:
#include <stdint.h>

typedef union {
    float       f[1];
    uint32_t    u32[1];
    int32_t     i32[1];
    uint16_t    u16[2];
    int16_t     i16[2];
    uint8_t     u8[4];
    int8_t      i8[4];
} word32;

/* Pack u32 from different byte orders */
static inline uint32_t  pack_u32_1234(const uint8_t src[4]) { return ((word32){ .u8 = { src[0], src[1], src[2], src[3] } }).u32[0]; }
static inline uint32_t  pack_u32_3412(const uint8_t src[4]) { return ((word32){ .u8 = { src[2], src[3], src[0], src[1] } }).u32[0]; }
static inline uint32_t  pack_u32_2143(const uint8_t src[4]) { return ((word32){ .u8 = { src[1], src[0], src[3], src[2] } }).u32[0]; }
static inline uint32_t  pack_u32_4321(const uint8_t src[4]) { return ((word32){ .u8 = { src[3], src[2], src[1], src[0] } }).u32[0]; }

/* __BYTE_ORDER__ is predefined by GCC and Clang; the "-0" keeps the
   comparison well-formed if the macro happens to be undefined. */
#if __BYTE_ORDER__-0 == 1234
// Little-endian architecture
#define  pack_le32(...) pack_u32_1234(__VA_ARGS__)
#define  pack_be32(...) pack_u32_4321(__VA_ARGS__)
#elif __BYTE_ORDER__-0 == 4321
// Big-endian architecture
#define  pack_le32(...) pack_u32_4321(__VA_ARGS__)
#define  pack_be32(...) pack_u32_1234(__VA_ARGS__)
#elif __BYTE_ORDER__-0 == 3412
// PDP-endian architecture
#define  pack_le32(...) pack_u32_3412(__VA_ARGS__)
#define  pack_be32(...) pack_u32_2143(__VA_ARGS__)
#elif __BYTE_ORDER__-0 == 2143
// Inverse PDP-endian architecture
#define  pack_le32(...) pack_u32_2143(__VA_ARGS__)
#define  pack_be32(...) pack_u32_3412(__VA_ARGS__)
#else
#error Unknown __BYTE_ORDER__!
#endif
produces macro-like inline functions that typically generate excellent code for constructing a 32-bit unsigned integer from four 8-bit unsigned integers. (I'm not sure I numbered the macros right for each architecture, though.) The code tends to be better than code accessing memory via a union, although you can certainly construct edge cases either way. Note, however, that accessing unaligned memory is problematic on many architectures; the above accessor functions are not affected by that. Because the above functions are static inline, they do not add to the binary size unless used.

If you have a pointer unsigned char *p to a buffer position, and want to obtain the 32-bit unsigned integer stored in big-endian byte order 5 bytes after p, you'd call pack_be32((const uint8_t *)(p + 5)). The source is portable between wildly different architectures, and the compiled binary code is quite good (depending somewhat on your optimization settings, of course).
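A cut-down, self-contained sketch of that call, assuming GCC or Clang (it relies on their predefined __BYTE_ORDER__ macro and only handles little- and big-endian hosts). The wrapper read_be32_at() is a hypothetical helper added for the example; the rest is trimmed from the accessor code above.

```c
#include <stddef.h>
#include <stdint.h>

typedef union {
    uint32_t u32[1];
    uint8_t  u8[4];
} word32;

/* Copy the bytes in the given order into the union, then read them as u32. */
static inline uint32_t pack_u32_1234(const uint8_t src[4])
{
    return ((word32){ .u8 = { src[0], src[1], src[2], src[3] } }).u32[0];
}

static inline uint32_t pack_u32_4321(const uint8_t src[4])
{
    return ((word32){ .u8 = { src[3], src[2], src[1], src[0] } }).u32[0];
}

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define pack_be32(...) pack_u32_4321(__VA_ARGS__)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define pack_be32(...) pack_u32_1234(__VA_ARGS__)
#else
#error This sketch only handles little- and big-endian hosts.
#endif

/* Hypothetical helper: read a big-endian 32-bit value at p + offset.
   No alignment requirement, because the bytes are read one at a time. */
static uint32_t read_be32_at(const unsigned char *p, size_t offset)
{
    return pack_be32((const uint8_t *)(p + offset));
}
```

For a buffer holding five filler bytes followed by 0xDE 0xAD 0xBE 0xEF, read_be32_at(buf, 5) yields 0xDEADBEEF on either byte order, even though offset 5 is not 4-byte aligned.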
 

