One simple alternative is to use inline accessor functions.
Compiler Explorer example for Cortex-M3.
Because of the various issues with packed and/or unaligned structures, I do tend to use such accessors instead of structures, when transferring information. (I do use packed structures, often with anonymous unions as in the above example, for example for type punning and to simplify access to memory-mapped peripherals. In other words, I'm not saying one thing is bad and another is good, I'm only describing my preferred tool choices here.)
Because pointer arithmetic is not well defined for void pointers, I find it easiest to use an unsigned char pointer to the beginning of the buffer, even though the inline accessor functions do take a void pointer; i.e. their interface is basically
type get_type(const void *const ptr)
or occasionally
type get_type(const void *const ptr, const unsigned char byte_order_change)
where each bit i in byte_order_change corresponds to swapping 2^i-byte groups, allowing run-time adaptation to different byte orders (in files or messages) based on known prototype multibyte values. (For each type requiring 2^k bytes of storage, there are also 2^k possible byte orders, even though only two, in order of ascending or descending mathematical significance, i.e. little or big endian, are normally encountered.)
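A minimal sketch of what such accessors could look like for a 32-bit unsigned type. The names get_u32, get_u32_order and message_value, the use of memcpy() for the native-order load, and the "one tag byte followed by a 32-bit value" message layout are all my assumptions for illustration; they are not taken from the Compiler Explorer example.

```c
#include <stdint.h>
#include <string.h>

/* Native byte order. memcpy() is used here (an assumption, not necessarily
   what the linked example does) so the compiler picks an unaligned-safe load. */
static inline uint32_t get_u32(const void *const ptr)
{
    uint32_t value;
    memcpy(&value, ptr, sizeof value);
    return value;
}

/* Bit 0 of byte_order_change swaps adjacent 1-byte groups,
   bit 1 swaps adjacent 2-byte groups. Both set means full byte reversal. */
static inline uint32_t get_u32_order(const void *const ptr,
                                     const unsigned char byte_order_change)
{
    uint32_t value = get_u32(ptr);
    if (byte_order_change & 1)
        value = ((value & UINT32_C(0x00FF00FF)) << 8)
              | ((value >> 8) & UINT32_C(0x00FF00FF));
    if (byte_order_change & 2)
        value = (value << 16) | (value >> 16);
    return value;
}

/* Hypothetical message layout: one tag byte, then a 32-bit value.
   The offset arithmetic is done on an unsigned char pointer, not on void *. */
static inline uint32_t message_value(const void *const message,
                                     const unsigned char byte_order_change)
{
    const unsigned char *const buf = message;
    return get_u32_order(buf + 1, byte_order_change);
}
```

With a 32-bit type there are two swap bits, so byte_order_change takes the values 0 to 3; 0 is the native order and 3 is the fully reversed order.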
(Note that in C function declarations, static ≡ static inline. I personally have the habit of marking internal/local functions as static, and accessor functions as static inline, that's all. It helps in code maintenance, as it helps me structure the code better, and reduces the cognitive load with the ubiquitous "hey, what does this function do?" moments.)
I suspect, but have not verified, that
/* 16-bit value with alignment 1; the anonymous union allows type punning. */
typedef struct {
    union {
        uint16_t u16;
        int16_t  i16;
    };
} __attribute__((__packed__)) unaligned_16;
/* 32-bit value with alignment 1. */
typedef struct {
    union {
        uint32_t u32;
        int32_t  i32;
        float    f32;
    };
} __attribute__((__packed__)) unaligned_32;
/* 64-bit value with alignment 1. */
typedef struct {
    union {
        uint64_t u64;
        int64_t  i64;
        double   f64;
    };
} __attribute__((__packed__)) unaligned_64;
on current C compilers does yield 16-, 32-, and 64-bit number types one can use for unaligned accesses safely, with the extra "cost" of having to type (normal reference to structure member).i32 or whatever subtype one wants.
In other words, instead of using a packed structure with standard types, you can use a structure with unaligned members, defined as packed structures with effectively just one member. (I need type punning between types of the same size so often that I prefer to bake that in to these 'single effective member packed structures', as they cost nothing. Even if you used a single member, you'd still need to refer to that member; the anonymous union just lets you type-pun at that time if you wish.)
(EDITED to clarify: the above still requires global and static structures to be accessed via a temporary pointer. Just declaring static or global structures with aforementioned unaligned_nn type members does not stop the compiler from combining accesses to their members, as the static or global declaration effectively "neutralizes" the packed attribute.)
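As a rough sketch of the pointer-based access, here are hypothetical load/store helpers built on the 32-bit type above. The helper names are mine, and the whole thing assumes a compiler that supports the packed attribute (GCC or Clang style) and C11 for the compile-time checks. Like the rest of this approach, it leans on how compilers treat packed types rather than on anything the C standard promises.

```c
#include <stdint.h>

typedef struct {
    union {
        uint32_t u32;
        int32_t  i32;
        float    f32;
    };
} __attribute__((__packed__)) unaligned_32;

/* If packing works as expected, the type has no padding and alignment 1. */
_Static_assert(sizeof (unaligned_32) == 4, "unaligned_32 should be 4 bytes");
_Static_assert(_Alignof (unaligned_32) == 1, "unaligned_32 should have alignment 1");

/* Hypothetical helpers: every access goes through a pointer to the packed type. */
static inline void unaligned_store_u32(void *const ptr, const uint32_t value)
{
    ((volatile unaligned_32 *)ptr)->u32 = value;
}

static inline uint32_t unaligned_load_u32(const void *const ptr)
{
    return ((const volatile unaligned_32 *)ptr)->u32;
}

/* Same storage read as a signed value, i.e. type punning via the union. */
static inline int32_t unaligned_load_i32(const void *const ptr)
{
    return ((const volatile unaligned_32 *)ptr)->i32;
}
```

These can be pointed at any byte offset in a buffer, for example unaligned_load_u32(buf + 1) with buf an unsigned char pointer.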
With C11 _Generic, this could be extended to a macro that expands to an accessor call ensuring exactly one unaligned-safe access is done to the parameter object, i.e.
typedef struct { int a, b; } MyStruct;
MyStruct *src;
MyFn(UNALIGNED_DEREF_ONCE(&(src->a)), UNALIGNED_DEREF_ONCE(&(src->b)));
although folding the address-of into the macro would shorten the code (but "violates" the normal passing-by-value logic in C):
MyFn(UNALIGNED_ONCE(src->a), UNALIGNED_ONCE(src->b));
The idea is that _Generic is used to choose the size, byte order, and return type, by choosing a specific inline accessor function. The inline accessor functions take a pointer to the object whose value they need to return. For native byte order types, they cast that pointer to a (volatile) pointer to the suitable packed structure type, and obtain the value.
As to why the "exactly one access": that (in conjunction with the C abstract machine rules) should ensure multiple unaligned accesses are not folded into a single access. You ensure that (as much as one can in C, without resorting to assembly) by having the accessor function use a pointer with volatile type to the anonymous union structure type. The reason for it working is that the parameter to the accessor function is not volatile itself. This behaviour is not dictated by the C standard in any way, and is just a consequence of how current compilers implement the abstract machine described in the C standard, and specifically how they perform optimizations (especially at the abstract syntax tree level).
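A possible shape for the native byte order case, limited to 16- and 32-bit integers to keep it short. UNALIGNED_DEREF_ONCE is the macro name used above; the accessor names and the set of supported types are my assumptions. It uses int32_t instead of plain int so the _Generic associations are unambiguous, and needs C11 plus packed attribute support.

```c
#include <stdint.h>

typedef struct {
    union { uint16_t u16; int16_t i16; };
} __attribute__((__packed__)) unaligned_16;

typedef struct {
    union { uint32_t u32; int32_t i32; float f32; };
} __attribute__((__packed__)) unaligned_32;

/* Each accessor does one volatile access through the packed type.
   The parameter itself is not volatile; only the cast pointer is. */
static inline uint16_t unaligned_get_u16(const uint16_t *const ptr)
{
    return ((const volatile unaligned_16 *)ptr)->u16;
}

static inline int16_t unaligned_get_i16(const int16_t *const ptr)
{
    return ((const volatile unaligned_16 *)ptr)->i16;
}

static inline uint32_t unaligned_get_u32(const uint32_t *const ptr)
{
    return ((const volatile unaligned_32 *)ptr)->u32;
}

static inline int32_t unaligned_get_i32(const int32_t *const ptr)
{
    return ((const volatile unaligned_32 *)ptr)->i32;
}

/* _Generic picks the accessor (and thus size and return type)
   from the pointer type; const-qualified targets are listed separately. */
#define UNALIGNED_DEREF_ONCE(ptr) _Generic((ptr),   \
        uint16_t *:       unaligned_get_u16,        \
        const uint16_t *: unaligned_get_u16,        \
        int16_t *:        unaligned_get_i16,        \
        const int16_t *:  unaligned_get_i16,        \
        uint32_t *:       unaligned_get_u32,        \
        const uint32_t *: unaligned_get_u32,        \
        int32_t *:        unaligned_get_i32,        \
        const int32_t *:  unaligned_get_i32         \
    )(ptr)
```

A pointer type not listed in the macro is a compile-time error, which I would consider a feature here rather than a limitation.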
For non-native byte order types with "exactly one access", volatile is replaced by a memory copy to an array of unsigned char (or to a structure or union that contains the desired type and an array of unsigned char); the unsigned char array is permuted according to the byte order change, and then the resulting value is returned via type-punned union access.
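The copy, permute, and type-pun steps could be sketched like this for a 32-bit float. The function name get_f32_order and the bit meanings of byte_order_change (bit 0 swaps single bytes, bit 1 swaps byte pairs) are assumptions of mine, and the code assumes float is four bytes.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof (float) == 4, "this sketch assumes a 4-byte float");

static inline float get_f32_order(const void *const ptr,
                                  const unsigned char byte_order_change)
{
    union {
        float         f32;
        unsigned char b[4];
    } t;
    unsigned char tmp;

    /* The one and only access to the source object. */
    memcpy(t.b, ptr, sizeof t.b);

    /* Bit 0: swap adjacent bytes, b0<->b1 and b2<->b3. */
    if (byte_order_change & 1) {
        tmp = t.b[0]; t.b[0] = t.b[1]; t.b[1] = tmp;
        tmp = t.b[2]; t.b[2] = t.b[3]; t.b[3] = tmp;
    }

    /* Bit 1: swap adjacent byte pairs, b0b1<->b2b3. */
    if (byte_order_change & 2) {
        tmp = t.b[0]; t.b[0] = t.b[2]; t.b[2] = tmp;
        tmp = t.b[1]; t.b[1] = t.b[3]; t.b[3] = tmp;
    }

    /* Type-punned read of the permuted bytes. */
    return t.f32;
}
```

Since everything after the memcpy() works on a local copy, the compiler is free to turn the permutation into a byte-swap instruction where one exists.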
In any case, I personally prefer a nested structure of single-member unaligned/packed structures instead, i.e.
typedef struct { unaligned_32 a, b; } MyStruct;
MyStruct *src;
MyFn((src->a).i32, (src->b).i32);
noting that whether MyStruct is declared packed or not is irrelevant here; it is the members that are declared "unaligned". This is as close to *(type)__builtin_unaligned(pointer) as I can currently get. (EDITED to add: the use of a pointer for the access is mandatory. If you declare MyStruct foo; then calling MyFn(foo.a.i32, foo.b.i32) may lead to combined loads. To fix, use const volatile MyStruct *src = &foo; and MyFn((src->a).i32, (src->b).i32);.)
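Put together as something that compiles, the nested-structure variant with the pointer fix might look as follows. MyFn here is only a stand-in (it subtracts its arguments) and call_with_foo is a hypothetical wrapper; neither is from the test example. Packed attribute support and C11 are assumed.

```c
#include <stdint.h>

typedef struct {
    union { uint32_t u32; int32_t i32; float f32; };
} __attribute__((__packed__)) unaligned_32;

/* Not declared packed itself; its members already have alignment 1. */
typedef struct { unaligned_32 a, b; } MyStruct;

_Static_assert(_Alignof (MyStruct) == 1, "members should make MyStruct alignment 1");

/* Stand-in for whatever function consumes the two values. */
static int32_t MyFn(const int32_t a, const int32_t b)
{
    return a - b;
}

/* A static object: direct member access here may get combined loads. */
static MyStruct foo;

/* Accessing foo through a const volatile pointer instead. */
static int32_t call_with_foo(void)
{
    const volatile MyStruct *const src = &foo;
    return MyFn((src->a).i32, (src->b).i32);
}
```

Whether the two loads actually stay separate is best checked in the generated assembly for the target in question.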
Experimenting with this on Compiler Explorer suggests this works for all the compilers that it supports, although I only tested a few examples I personally care about, and would thus appreciate a heads-up if anyone finds a counterexample (an architecture where the above, or the test example, fails to generate safe-unaligned-access code). The only non-standard feature needed is the packed attribute support.