Let's say you have an architecture with two different types of UARTs, so you describe their register layouts with struct uart_regs and struct usart_regs. Let's further say that their layouts are utterly incompatible, but they both have rx and tx fields, and you want to be able to index these as if they were arrays.
In a header file, you declare the structure types, and then the UARTs/USARTs available. One common approach is to define their addresses in a linker script, and their counts in preprocessor macros:
#include <stdint.h>

struct uart_regs {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
};

struct usart_regs {
    uint8_t status;
    uint8_t padding[1];
    uint8_t rx;
    uint8_t tx;
};

#define UARTS 3
#define USARTS 2

extern volatile struct uart_regs uart_array[UARTS];
extern volatile struct usart_regs usart_array[USARTS];
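Since everything that follows relies on the exact byte offsets of the members, it may be worth pinning the layout down at compile time. Here is a minimal sketch, assuming a C11 compiler. The structure definitions are repeated so that the block compiles on its own, and the helper names uart_tx_offset() and usart_tx_offset() are hypothetical:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

struct uart_regs {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
};

struct usart_regs {
    uint8_t status;
    uint8_t padding[1];
    uint8_t rx;
    uint8_t tx;
};

/* Fail the build if the layout ever drifts from the one described above. */
static_assert(sizeof (struct uart_regs) == 8, "uart_regs must be 8 bytes");
static_assert(offsetof(struct uart_regs, tx) == 1, "uart tx must be at offset 1");
static_assert(sizeof (struct usart_regs) == 4, "usart_regs must be 4 bytes");
static_assert(offsetof(struct usart_regs, tx) == 3, "usart tx must be at offset 3");

/* Hypothetical helper: byte offset of the tx member of the index'th
   element, measured from the start of a struct uart_regs array. */
static inline size_t uart_tx_offset(size_t index)
{
    return index * sizeof (struct uart_regs) + offsetof(struct uart_regs, tx);
}

/* Hypothetical helper: the same for a struct usart_regs array. */
static inline size_t usart_tx_offset(size_t index)
{
    return index * sizeof (struct usart_regs) + offsetof(struct usart_regs, tx);
}
```

With this layout, the tx members sit at byte offsets 1, 9 and 17 from the start of uart_array, and at 3 and 7 from the start of usart_array.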
In the same header file, declare the TX array like this (the RX array is declared the same way, using the rx members):
#define TXS (UARTS+USARTS)

static volatile uint8_t *const tx_ptr[TXS] = {
    &(uart_array[0].tx),
    &(uart_array[1].tx),
    &(uart_array[2].tx),
    &(usart_array[0].tx),
    &(usart_array[1].tx),
};

#define TX(i) (*tx_ptr[i])
Again, the TX(number) macro expands to the tx structure member of the number'th uart or usart structure, and can be used on either side of an assignment (it is a valid "lvalue").
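To illustrate the lvalue property, here is a minimal sketch meant to be compiled on a host machine rather than on the target. It assumes the register arrays are ordinary arrays in RAM (on real hardware the linker script would place them), and the function name tx_demo() is hypothetical:

```c
#include <stdint.h>

struct uart_regs {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
};

struct usart_regs {
    uint8_t status;
    uint8_t padding[1];
    uint8_t rx;
    uint8_t tx;
};

#define UARTS 3
#define USARTS 2

/* Assumption: host-side stand-ins for the linker-provided register arrays. */
volatile struct uart_regs uart_array[UARTS];
volatile struct usart_regs usart_array[USARTS];

#define TXS (UARTS+USARTS)

static volatile uint8_t *const tx_ptr[TXS] = {
    &(uart_array[0].tx),
    &(uart_array[1].tx),
    &(uart_array[2].tx),
    &(usart_array[0].tx),
    &(usart_array[1].tx),
};

#define TX(i) (*tx_ptr[i])

/* Hypothetical demo: returns 0 if every access landed in the expected member. */
int tx_demo(void)
{
    TX(1) = 0x55;                  /* left side: store to uart_array[1].tx */
    uint8_t copy = TX(1);          /* right side: load from uart_array[1].tx */
    TX(3) = copy;                  /* store to usart_array[0].tx */
    TX(4) = (uint8_t)(TX(3) + 1);  /* both sides in a single statement */

    if (uart_array[1].tx != 0x55)
        return 1;
    if (usart_array[0].tx != 0x55)
        return 2;
    if (usart_array[1].tx != 0x56)
        return 3;
    return 0;
}
```

On the host the "registers" are plain memory, so a read simply returns the last value written. Real transmit registers may well behave differently when read back.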
Note that because of the const, the compiler can optimize accesses through the pointer array and macro into direct accesses to the structures themselves, as long as the index is known at compile time (a constant, or a loop the compiler fully unrolls). When that happens, the object files you compile will not have a symbol named tx_ptr, nor any references to it.
So, while it may look like you get a copy of that array in each object file, the const will save you from that, assuming you enable optimizations (-Os, -O1, -O2). -Og does not unroll loops like the one in the test below and will use the array, so with that you will get a copy of the array in each object file compiled with -Og that includes this header file.
Let's test the above:
void tx_all(uint8_t value) {
    for (int i = 0; i < TXS; i++)
        TX(i) = value;
}
Using GCC 5.4.1 through 14.2.0, with -Os or -O2 -march=armv7e-m -mthumb (aapcs32, 32-bit arm-none-eabi), tx_all() compiles to
tx_all:
        ldr     r3, .L3
        ldr     r2, .L3+4
        strb    r0, [r3, #1]
        strb    r0, [r3, #9]
        strb    r0, [r3, #17]
        strb    r0, [r2, #3]
        strb    r0, [r2, #7]
        bx      lr
.L3:
        .word   uart_array
        .word   usart_array
which is basically as optimal as you could make it even if you wrote it by hand (except perhaps for the order of the instructions), given that the addresses of uart_array and usart_array are not yet known at compile time.
Even if you reorder the elements in the tx_ptr array, the generated code stays optimal: only the order of the store instructions will change.
The purpose of the TX(number) macro is simply to hide this particular implementation detail by doing the dereferencing for you. This allows you to modify the implementation without having to modify any of the code that uses TX(). We can discuss whether it is an appropriate abstraction here, though. To me, it is just a tiny cost ("odd-looking" code, like assigning values to a function?) that saves me from a complicated search-and-replace process if I decide to change the implementation behind it for any reason.
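As one illustration of swapping the implementation, here is a sketch in which TX() is backed by a hypothetical inline function tx_reg() instead of the pointer array. This is not the approach described above, just an example of a change that code using TX() would not notice. The arrays are again defined as ordinary RAM (an assumption) so that the block can be compiled on a host:

```c
#include <stdint.h>

struct uart_regs {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
};

struct usart_regs {
    uint8_t status;
    uint8_t padding[1];
    uint8_t rx;
    uint8_t tx;
};

#define UARTS 3
#define USARTS 2
#define TXS (UARTS+USARTS)

/* Assumption: host-side stand-ins for the linker-provided register arrays. */
volatile struct uart_regs uart_array[UARTS];
volatile struct usart_regs usart_array[USARTS];

/* Hypothetical replacement for the tx_ptr[] lookup: indexes 0..UARTS-1
   map to the UARTs, and UARTS..TXS-1 map to the USARTs. As with the
   array version, the caller must keep i within 0..TXS-1. */
static inline volatile uint8_t *tx_reg(int i)
{
    return (i < UARTS) ? &uart_array[i].tx
                       : &usart_array[i - UARTS].tx;
}

/* Same name and same usage as before; only the expansion differs. */
#define TX(i) (*tx_reg(i))

/* Unchanged caller code. */
void tx_all(uint8_t value) {
    for (int i = 0; i < TXS; i++)
        TX(i) = value;
}
```

tx_all() is identical to the earlier version, which is the whole point of the macro: the change stays in the header.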