Long story short: Write your C software in a way so it doesn't matter how data is stored in memory.
I disagree. You are targeting a specific embedded platform, so you can rely on its specific memory layout.
Always make the size of a struct a multiple of the word size. That prevents the optimizer from mis-aligning the struct, since mis-aligning it no longer saves any space. You'll spend some bytes on padding, but you should have plenty to spare. You wanted performance, right?
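As a rough sketch of that rule, assuming a 32-bit target where one word is 4 bytes: the struct below is hypothetical (the names `sensor_sample`, `WORD_SIZE` and `reserved` are made up for illustration). It pads itself explicitly and lets the compiler check the size.

```c
#include <assert.h>   /* static_assert macro (C11) */
#include <stdint.h>

/* Assumption: a 32-bit target, so one word is 4 bytes. */
#define WORD_SIZE 4u

/* Hypothetical record; the names are illustrative only. */
struct sensor_sample {
    uint32_t timestamp;  /* 4 bytes */
    uint16_t value;      /* 2 bytes */
    uint8_t  channel;    /* 1 byte  */
    uint8_t  reserved;   /* explicit padding byte, rounds the size up to 8 */
};

/* Fail the build if a later change breaks the word-multiple rule. */
static_assert(sizeof(struct sensor_sample) % WORD_SIZE == 0,
              "struct sensor_sample must be a multiple of the word size");
```

Naming the padding byte yourself keeps the layout visible in the source, and the compile-time check catches anyone who adds a member later without re-padding.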
And please don't order the members of a struct like this:
uint8_t
uint32_t
uint8_t
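You'll lose 6 bytes to padding: 3 after each uint8_t, because the uint32_t has to start on a 4-byte boundary and the struct size is rounded up to a multiple of 4.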
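To see the cost for yourself, here is a small sketch. It assumes a target where uint32_t needs 4-byte alignment (most 32-bit MCUs, not an 8-bit one). The struct and function names are invented for this example.

```c
#include <assert.h>   /* static_assert macro (C11) */
#include <stddef.h>
#include <stdint.h>

/* Member order from the "don't" example: small, large, small. */
struct wasteful {
    uint8_t  a;   /* 1 byte, then 3 padding bytes so b is 4-byte aligned */
    uint32_t b;   /* 4 bytes */
    uint8_t  c;   /* 1 byte, then 3 tail padding bytes */
};

/* Same members, largest first. */
struct compact {
    uint32_t b;   /* 4 bytes */
    uint8_t  a;   /* 1 byte */
    uint8_t  c;   /* 1 byte, then 2 tail padding bytes */
};

/* Bytes that actually carry data in either layout. */
#define PAYLOAD_BYTES (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint8_t))

/* Padding the compiler added to each layout. */
size_t padding_in_wasteful(void) { return sizeof(struct wasteful) - PAYLOAD_BYTES; }
size_t padding_in_compact(void)  { return sizeof(struct compact)  - PAYLOAD_BYTES; }

/* Reordering should never make the struct bigger. */
static_assert(sizeof(struct compact) <= sizeof(struct wasteful),
              "largest-first ordering should not grow the struct");
```

On such a target the first layout takes 12 bytes and the second 8, so sorting members from largest to smallest gets 4 of the 6 padding bytes back.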
This __attribute__((aligned(4))) is a must when you're inserting inline SIMD assembly. You might even want to put the
register keyword before it. You might also want to align constants in flash. The flash read width is usually 128 bits or more,
but that depends on the prefetching and caching configuration.
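A minimal sketch of both alignments, assuming GCC or Clang (the attribute is a compiler extension, not standard C) and a 128-bit flash line, which you should check against your part's reference manual. The names `simd_buffer`, `coeff_table`, `is_aligned` and `check_alignment` are hypothetical. Whether a const object really ends up in flash depends on your toolchain and linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Working buffer for word-wide or SIMD accesses: force 4-byte alignment. */
uint8_t simd_buffer[64] __attribute__((aligned(4)));

/* Constant table, typically placed in flash (.rodata) by the linker script.
 * Assumption: a 128-bit flash line, hence the 16-byte alignment. */
const uint32_t coeff_table[8] __attribute__((aligned(16))) = {
    1, 2, 3, 4, 5, 6, 7, 8
};

/* Returns nonzero when ptr sits on a multiple of boundary bytes. */
int is_aligned(const void *ptr, size_t boundary)
{
    return ((uintptr_t)ptr % boundary) == 0;
}

/* Debug-build sanity check; call it once at start-up. */
void check_alignment(void)
{
    assert(is_aligned(simd_buffer, 4));
    assert(is_aligned(coeff_table, 16));
}
```

Calling check_alignment() early in a debug build is a cheap way to confirm that the linker actually honored the requested alignment.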