-
So I'm using a part with 5 UARTs. The TX registers (for example) are called U1TXBUF, U2TXBUF etc. in the manufacturer's headers.
Is there a clever and code-efficient way to access these (for read and write) like an array, UTXREG[0..4], where the index is a variable, not a constant?
-
It can be done with an array of pointers but you have to use the * in front of it to access the actual value.
uint8_t *uarts_tx[5];
//Setup
uarts_tx[0] = &U1TXBUF; //store the register's address, not its current value
uarts_tx[1] = &U2TXBUF;
uarts_tx[2] = &U3TXBUF;
uarts_tx[3] = &U4TXBUF;
uarts_tx[4] = &U5TXBUF;
//Usage
*uarts_tx[1] = 0x0A;
You can create an array of UART peripherals by using structures and fill in the memory gaps with reserved space like it is done in STM32 header files.
//Universal Serial Bus Full Speed Device
typedef struct
{
__IO uint16_t EP0R; //USB Endpoint 0 register, Address offset: 0x00
__IO uint16_t RESERVED0; //Reserved
__IO uint16_t EP1R; //USB Endpoint 1 register, Address offset: 0x04
__IO uint16_t RESERVED1; //Reserved
__IO uint16_t EP2R; //USB Endpoint 2 register, Address offset: 0x08
__IO uint16_t RESERVED2; //Reserved
__IO uint16_t EP3R; //USB Endpoint 3 register, Address offset: 0x0C
__IO uint16_t RESERVED3; //Reserved
__IO uint16_t EP4R; //USB Endpoint 4 register, Address offset: 0x10
__IO uint16_t RESERVED4; //Reserved
__IO uint16_t EP5R; //USB Endpoint 5 register, Address offset: 0x14
__IO uint16_t RESERVED5; //Reserved
__IO uint16_t EP6R; //USB Endpoint 6 register, Address offset: 0x18
__IO uint16_t RESERVED6; //Reserved
__IO uint16_t EP7R; //USB Endpoint 7 register, Address offset: 0x1C
__IO uint16_t RESERVED7[17]; //Reserved
__IO uint16_t CNTR; //Control register, Address offset: 0x40
__IO uint16_t RESERVED8; //Reserved
__IO uint16_t ISTR; //Interrupt status register, Address offset: 0x44
__IO uint16_t RESERVED9; //Reserved
__IO uint16_t FNR; //Frame number register, Address offset: 0x48
__IO uint16_t RESERVEDA; //Reserved
__IO uint16_t DADDR; //Device address register, Address offset: 0x4C
__IO uint16_t RESERVEDB; //Reserved
__IO uint16_t BTABLE; //Buffer Table address register, Address offset: 0x50
__IO uint16_t RESERVEDC; //Reserved
} USB_TypeDef;
This is the structure for the USB peripheral, but it shows the intention. Use typedefs to name a structure and then incorporate it in a new structure to get an array of them.
//Structure for Buffer Descriptor Table
typedef struct
{
PMA_ENTRY TX_ADDRESS;
PMA_ENTRY TX_COUNT;
PMA_ENTRY RX_ADDRESS;
PMA_ENTRY RX_COUNT;
} BTABLE_ENTRY;
//Structure for endpoint descriptors
typedef struct
{
BTABLE_ENTRY EPD[8];
} BTABLE_TypeDef;
Again USB stuff, but it shows the intention.
Use a #define to map the array of structures to the actual memory.
#define USB_BASE 0x40005C00
#define USB ((USB_TypeDef *)USB_BASE)
-
Yeah, since the registers are not adjacent in memory, you need to add one layer of indirection, i.e. use an array of pointers, and then remember to dereference the pointer when accessing the register.
-
It can be done with an array of pointers but you have to use the * in front of it to access the actual value.
uint8_t *uarts_tx[5];
//Setup
uarts_tx[0] = &U1TXBUF;
uarts_tx[1] = &U2TXBUF;
uarts_tx[2] = &U3TXBUF;
uarts_tx[3] = &U4TXBUF;
uarts_tx[4] = &U5TXBUF;
//Usage
*uarts_tx[1] = 0x0A;
Presumably I would also need to declare it volatile
volatile uint8_t *uarts_tx[5];
any reason I couldn't then add, say,
#define UARTTX *uarts_tx
and use UARTTX[n]?
-
Yes,
volatile uint8_t * uart_tx[5]
so that the target type itself is volatile; the pointer will not be and doesn't need to be.
#define is just fancy text replacement but I would definitely not do what you suggest (hide the pointer-ness) because that's a great way to shoot yourself in the foot as you don't see you are dealing with a pointer and dereferencing it. Why not expose what is really happening everywhere.
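To make the foot-gun concrete, here is a minimal sketch of one case where the object-like macro misbehaves. It runs on a host, so the registers are ordinary variables (sim_tx is a stand-in, not a vendor name), and UTXBUF(n) is a hypothetical parenthesised alternative, not something from the manufacturer's headers. The pointer array is deliberately left non-const so that the problematic expression compiles at all.

```c
#include <stdint.h>

/* ASSUMPTION: host-side stand-ins for two memory-mapped TX registers. */
static volatile uint8_t sim_tx[2];

/* Deliberately NOT const, so the pitfall below compiles. */
static volatile uint8_t *uarts_tx[2] = { &sim_tx[0], &sim_tx[1] };

/* Object-like macro as proposed: UARTTX[n] expands to *uarts_tx[n]. */
#define UARTTX *uarts_tx

/* Hypothetical function-like alternative with full parentheses. */
#define UTXBUF(n) (*uarts_tx[(n)])

/* Reads like "return register 0, then increment the register", but it
 * expands to *uarts_tx[0]++, which is *(uarts_tx[0]++): the POINTER in
 * the array is advanced and the register is left untouched. */
uint8_t pitfall_post_increment(void)
{
    return UARTTX[0]++;
}

/* With the parenthesised macro the register itself is incremented. */
uint8_t safe_post_increment(void)
{
    return UTXBUF(1)++;
}
```

Plain reads and writes such as UARTTX[n] = x behave the same with both macros; the difference only shows up when another operator binds tighter than the hidden dereference. Declaring the pointers const (volatile uint8_t *const) would turn the first function into a compile error instead of a silent bug.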
-
Also don't forget to initialize the array with e.g.:
volatile uint8_t *uarts_tx[5] = {
&U1TXBUF, &U2TXBUF, &U3TXBUF, &U4TXBUF, &U5TXBUF
};
somewhere globally or as a static variable inside a function. Otherwise the array of pointers is repopulated every time the function is entered.
The array of pointers is a good readable and flexible method.
The struct alternative as presented by pcprogrammer does have 1 advantage though: the compiler knows exactly what the memory layout of the I/O looks like and so it possibly can do its own pointer arithmetic. For the array of pointers it just sees a bunch of magic constants that it has to load and then use.
I don't know exactly which part you're using. And this is only a very slight 'optimization' with an extra qualification. But if you use a part such as the PIC32MX250, then its uarts are nicely colocated in the I/O space at virtual addresses 6020h, 6220h, 6420h, 6620h, 6820h, etc.
Just from observation we could say the memory address we need is: 6020h + idx*200h (with idx = 0 .. 4). Writing it as-is is a bit of a hack. But we can craft a structure that captures this PIC32's I/O space and then use that:
typedef struct {
volatile uint32_t MODE; // 6000h
uint32_t _1; uint32_t _2; uint32_t _3;
volatile uint32_t STA; // 6010h
uint32_t _4; uint32_t _5; uint32_t _6;
volatile uint8_t TXREG; // 6020h
uint32_t _7; uint32_t _8; uint32_t _9;
volatile uint8_t RXREG; // 6030h
uint32_t _10; uint32_t _11; uint32_t _12;
volatile uint32_t BRG; // 6040h
uint32_t _13; uint32_t _14; uint32_t _15;
// 6050h - 61FCh (108x4bytes): unused
uint32_t _unused[108];
} UARTPeripheral;
//...
UARTPeripheral* uarts = (UARTPeripheral*) &U1MODE; // address of the first UART's first register
uarts[idx].TXREG = data;
It also gets rid of the nasty * before using a register. On MIPS4K CPU, it is ever so marginally (https://godbolt.org/z/M84rfPfGd) faster and smaller (-1 instruction, no more constants) with some dummy values for the register locations. On other architectures, YMMV.
Note however, untested..
BIG downside: there is no sanity check to see if the indexed UART even exists. And this only works because the 5 UARTs are located sequentially inside the memory. On ARM parts this may not be the case at all, so then the array of pointers allows for more flexibility (e.g. you could also reshuffle the order of UARTs if you need to).
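The missing sanity check can be added cheaply. The sketch below is host-runnable, so the register block is an ordinary array (sim_uart_block is a stand-in; on the target the base would be something like the address of U1MODE). NUM_UARTS, uart_at() and uart_write_tx() are hypothetical names, and the registers are declared 32 bits wide on the assumption that they match the vendor's uint32_t declarations. The compile-time checks catch a miscounted padding word before it reaches hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_UARTS 5u

/* One 0x200-byte UART register window; only a few registers are shown.
 * Offsets follow the 6000h/6010h/6020h spacing described above. */
typedef struct {
    volatile uint32_t MODE;            /* +0x00 */
    uint32_t          _pad0[3];
    volatile uint32_t STA;             /* +0x10 */
    uint32_t          _pad1[3];
    volatile uint32_t TXREG;           /* +0x20 */
    uint32_t          _pad2[3];
    volatile uint32_t RXREG;           /* +0x30 */
    uint32_t          _pad3[3];
    volatile uint32_t BRG;             /* +0x40 */
    uint32_t          _pad4[3 + 108];  /* +0x44 .. +0x1FC, unused */
} UARTPeripheral;

/* Compile-time layout checks: a wrong padding count fails the build. */
static_assert(offsetof(UARTPeripheral, STA)   == 0x10, "STA offset");
static_assert(offsetof(UARTPeripheral, TXREG) == 0x20, "TXREG offset");
static_assert(offsetof(UARTPeripheral, BRG)   == 0x40, "BRG offset");
static_assert(sizeof(UARTPeripheral) == 0x200, "UART stride");

/* ASSUMPTION: host-side stand-in for the memory-mapped register block. */
static UARTPeripheral sim_uart_block[NUM_UARTS];
#define UART_BASE (sim_uart_block)

/* Returns NULL for a UART index that does not exist. */
static UARTPeripheral *uart_at(size_t idx)
{
    return (idx < NUM_UARTS) ? &UART_BASE[idx] : NULL;
}

/* Returns 0 on success, -1 if the indexed UART does not exist. */
int uart_write_tx(size_t idx, uint8_t data)
{
    UARTPeripheral *u = uart_at(idx);
    if (u == NULL)
        return -1;
    u->TXREG = data;
    return 0;
}
```

The range check costs a compare and a branch per access, so in a hot path it may be preferable to validate the index once and keep the returned pointer.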
-
#define is just fancy text replacement but I would definitely not do what you suggest (hide the pointer-ness) because that's a great way to shoot yourself in the foot as you don't see you are dealing with a pointer and dereferencing it. Why not expose what is really happening everywhere.
Like Siwastaja wrote, it is risky to do so. With the structure example, which I use a lot in my code, it is clear that it is pointing to a structure by the selection of the structure elements. (-> for when dealing with a pointer to a structure, . (dot) when dealing with the actual structure)
uart[3]->tx = 0x0A;
-
So I'm using a part with 5 UARTS. The tx registers ( for example) are called U1TXBUF, U2TXBUF etc. in the manufacturer's headers
Is there a clever & code-efficient way to be able to access these (for read and write) like an array, UTXREG[0..4] Where the index is a variable, not a constant ?
Are the transmit buffers mapped at contiguous addresses?
-
Yes,
volatile uint8_t * uart_tx[5]
so that the target type itself is volatile; the pointer will not be and doesn't need to be.
#define is just fancy text replacement but I would definitely not do what you suggest (hide the pointer-ness) because that's a great way to shoot yourself in the foot as you don't see you are dealing with a pointer and dereferencing it. Why not expose what is really happening everywhere.
Actually I disagree - If your hardware regs are a mix of normal for non-array regs, and pointers for arrays, then there is more scope for errors than if they are all used consistently.
The array elements will only ever be read from and written to in the same way as normal register definitions, so I can't see why there would be any risk hiding the actual mechanism, and would make the code more obvious. More so if the case matches the normal fixed ones.
e.g.
for(i=0;i!=4;i++) UTXBUF[i]=buffer[i][j];
Seems to me to be more obvious than if it was using *utxbuf[]
In what situations might this cause any issues ?
Especially when I am the only one who will ever need to understand this, possibly years ahead when the next project re-uses the code.
BTW the definition in the header file is
#define U1TXREG U1TXREG
extern volatile uint32_t U1TXREG __attribute__((section("sfrs"), address(0xBF806020)));
Not sure what that #define is doing...
OK second complication - I'd also need to look at status bits in a register array.
The normal way of doing this would be something like
if(U1STAbits.UTXBF)...
which I'd like to look like
if(USTAbits[n].UTXBF)...
I don't think there is ever a need to set / clear status bits, and clearing interrupt bits gets messy as the bits are in different positions in different registers - probably need to do something like :
IFSCLR[n]=_IFS_UTXIF_MASK[n];
(This is done differently to using the <register>.bits syntax as the XC32 compiler is too dumb to substitute the hardware bit set/clear instructions, and I won't even start on the pain that caused me on one occasion, debugging a huge installation in Hong Kong, when I had to go out and buy an oscilloscope to use for a couple of hours.)
This is defined in the header files as
#define U1STAT U1STAT
typedef union {
struct {
uint32_t URXDA:1;
uint32_t OERR:1;
uint32_t FERR:1;
uint32_t PERR:1;
uint32_t RIDLE:1;
uint32_t ADDEN:1;
uint32_t URXISEL:2;
uint32_t TRMT:1;
uint32_t UTXBF:1;
uint32_t UTXEN:1;
uint32_t UTXBRK:1;
uint32_t URXEN:1;
uint32_t UTXINV:1;
uint32_t UTXISEL:2;
uint32_t ADDR:8;
uint32_t ADM_EN:1;
};
struct {
uint32_t :6;
uint32_t URXISEL0:1;
uint32_t URXISEL1:1;
uint32_t :6;
uint32_t UTXISEL0:1;
uint32_t UTXISEL1:1;
};
struct {
uint32_t :14;
uint32_t UTXSEL:2;
};
struct {
uint32_t w:32;
};
} __U1STAbits_t;
extern volatile __U1STAbits_t U1STAbits __asm__ ("U1STA") __attribute__((section("sfrs"), address(0xBF806010)));
I have no idea what's going on there with that __asm__ ....
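One possible shape for the indexed status-bit and interrupt-clear access is a const table with one entry per UART. This is only a sketch: it runs on a host, so sim_sta and sim_ifsclr stand in for the real U1STAbits and IFSxCLR registers, the union is a cut-down version of the vendor layout, and uart_desc_t, the UART table and the mask values are all made up for illustration. The access reads UART[n].sta->UTXBF rather than USTAbits[n].UTXBF.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down status layout; the vendor union has more fields. Bit-field
 * placement is compiler-specific, exactly as in the vendor header. */
typedef union {
    struct {
        uint32_t URXDA : 1;
        uint32_t       : 7;
        uint32_t TRMT  : 1;
        uint32_t UTXBF : 1;
    };
    uint32_t w;
} usta_bits_t;

#define NUM_UARTS 3

/* ASSUMPTION: host-side stand-ins for the memory-mapped registers. */
static volatile usta_bits_t sim_sta[NUM_UARTS];
static volatile uint32_t    sim_ifsclr[2];

/* Everything that differs between UARTs lives in one table entry. */
typedef struct {
    volatile usta_bits_t *sta;       /* status register of this UART     */
    volatile uint32_t    *ifs_clr;   /* flag-clear register for its IRQ  */
    uint32_t              txif_mask; /* TX flag position in that register */
} uart_desc_t;

/* Mask values below are illustrative, not taken from a datasheet. */
static const uart_desc_t UART[NUM_UARTS] = {
    { &sim_sta[0], &sim_ifsclr[0], 1u << 28 },
    { &sim_sta[1], &sim_ifsclr[1], 1u << 10 },
    { &sim_sta[2], &sim_ifsclr[1], 1u << 23 },
};

/* True while the TX buffer of UART n is full. */
bool uart_tx_full(unsigned n)
{
    return UART[n].sta->UTXBF;
}

/* Writes the per-UART mask to the matching clear register. */
void uart_clear_tx_irq(unsigned n)
{
    *UART[n].ifs_clr = UART[n].txif_mask;
}
```

Keeping the register pointer and the mask in the same entry means the differing bit positions are written down once, next to each other, instead of in two parallel arrays.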
-
So I'm using a part with 5 UARTS. The tx registers ( for example) are called U1TXBUF, U2TXBUF etc. in the manufacturer's headers
Is there a clever & code-efficient way to be able to access these (for read and write) like an array, UTXREG[0..4] Where the index is a variable, not a constant ?
Are the transmit buffers mapped at contiguous addresses?
No, they are interleaved with UART status registers, and even if they were consecutive I wouldn't like to rely on that - the point is to create a code framework that works with an arbitrary number of UARTs.
Even if the current part maps them at consecutive addresses, other, future larger parts might not, e.g. on a fancy new part with 10 UARTS they decided to put UART6 onwards at a different start address because they ran out of space after UART5
-
OK, fair enough.
-
Use
volatile uart_t *const uart[4] = { &uart0, &uart1, &uart2, &uart3 };
The const is key, because it tells the compiler that the pointers in the array will not be modified (because it is on the right side of the asterisk). The left side tells what they point to, and those are volatile, so accesses to the pointed-to uart_t structures are volatile and mutable.
This way, if you enable optimizations, using uart[n] where the value of n is known at compile time (being a constant or macro or the loop variable in a short loop when loop unrolling is enabled), the compiler will generate the code that uses the structure directly.
You can see the difference in the following example:
#include <stdint.h>
typedef struct {
uint8_t rx;
uint8_t tx;
uint8_t status;
uint8_t padding[5];
} uart_t;
extern volatile uart_t uart0, uart1, uart2;
volatile uart_t *const uart[3] = { &uart0, &uart1, &uart2 };
#define uarts (sizeof uart / sizeof uart[0])
void uart_tx_all(uint8_t value) {
for (uint_fast8_t i = 0; i < uarts; i++)
uart[i]->tx = value;
}
With sufficient optimizations, uart_tx_all() compiles to three ldr+strb pairs on ARMv7e-m using Thumb, loading the structure address (directly using uart0, uart1, uart2 symbols; not through the uart array at all), storing value to offset 1 within each structure, plus a final bx lr.
If you omit the const, then for ARMv7e-m and some other architectures you get quite different code. It will first load the uart array address, then load the address of each structure through that pointer, and finally store value to offset 1 relative to those pointers. This is because the compiler cannot assume the current pointers in the array are the initialized values. Only first one is a direct load, the three other loads are indirect (relative to the first).
This is one of the rare examples where adding const can make a measurable difference to the generated code.
-
It depends how they are located in memory. Start by looking up the addresses of UART modules and individual registers, then use C to describe what you see.
Most likely this is an array of structs, each struct containing all registers for a single UART module. If you declare it as such, you can access the structs by index (or by pointer if you prefer). No need to create any indirections or use extra memory.
-
If all the UARTs have the same memory layout and if they are contiguous in memory then this is of course the best thing to do. But that is a lot of ifs. It is not unheard of to see heterogeneous peripherals; ST especially loves to play such games, so you may have three similar UARTS and then a fourth different UART and three different types of DMAs.
Discontinuous IO is even more common in microcontrollers. Just having a different amount of gap between two instances ruins the idea. So indirection is often needed, but it's not a bad thing. The array of const pointers as shown by Nominal is very simple and efficient.
-
Sure. That's why you need to look at addresses first.
-
Use
volatile uart_t *const uart[4] = { &uart0, &uart1, &uart2, &uart3 };
The const is key, because it tells the compiler that the pointers in the array will not be modified (because it is on the right side of the asterisk). The left side tells what they point to, and those are volatile, so accesses to the pointed-to uart_t structures are volatile and mutable.
Yep.
Now, there may be cases for using such an array, but it may not be so obvious. It has a benefit if you, say, must iterate through a number of identical peripherals.
But if you only access them individually, it's not obvious to see a benefit of:
'uart[1]' compared to 'uart1'.
Can be potentially interesting for easily "remapping" peripherals in your firmware, possibly dynamically (if that makes any sense for your application). If statically, it looks harder to justify.
Just a thought - there may be specific use cases that justify it, but just mentioning that overly "abstracting" when the abstraction gives no abstraction at all may not be all that useful.
OTOH, writing functions that take a pointer to a peripheral struct rather than directly hard-coding the peripheral pointer in them can be a good idea for making them more versatile.
So that could look like:
void DataSend(volatile uart_t * const uart, size_t n, const uint8_t data[n])
{
...
}
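As one possible illustration of a function written against a peripheral pointer, here is a host-runnable sketch. It is not the body elided above: uart_send_nonblocking() is a hypothetical name, it returns a count instead of void so that it never blocks, the uart_t layout is the toy one used elsewhere in this thread, and treating bit 0 of status as "TX buffer full" is purely an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy register layout, as used in other examples in this thread. */
typedef struct {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
} uart_t;

/* ASSUMPTION: bit 0 of status means "TX buffer full". */
#define UART_STATUS_TX_FULL 0x01u

/* Writes bytes to the given UART until the data runs out or the
 * peripheral reports a full TX buffer. Returns the number written. */
size_t uart_send_nonblocking(volatile uart_t *const uart,
                             size_t n, const uint8_t *data)
{
    size_t sent = 0;
    while (sent < n && !(uart->status & UART_STATUS_TX_FULL))
        uart->tx = data[sent++];
    return sent;
}
```

Because the peripheral arrives as a parameter, the same function serves every UART, and a test can pass the address of an ordinary variable instead of real hardware.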
-
But if you only access them individually, it's not obvious to see a benefit of:
'uart[1]' compared to 'uart1'.
The benefit is that the index can be a variable, so you can have code that works with different numbers of uarts with no code changes, and the same code can handle every UART in a loop
-
You will almost certainly need to access other UART registers, so the way I usually organise peripherals is to typedef a structure that maps over the device register set. Here's one from an old project, a Mitsubishi 16-bit CPU:
typedef struct {
U8 u8SpecModeReg4; /* Special mode register 4 */
U8 u8SpecModeReg3; /* .. .. .. 3 */
U8 u8SpecModeReg2; /* .. .. .. 2 */
U8 u8SpecModeReg; /* Special mode register */
U8 u8TxRxMode; /* Transmit/receive mode register */
U8 u8BitRate; /* Bit rate generator */
/* Even address, no padding to 16 bit boundary */
U16 u16TxDataBuf; /* Transmit buffer register */
U8 u8TxRxCtl0; /* Transmit/receive control register 0 */
U8 u8TxRxCtl1; /* .. .. .. .. 1 */
/* Even address, no padding to 16 bit boundary */
U16 u16RxDataBuf; /* Receive data buffer register */
U8 u8TxIntCtl; /* Transmit interrupt control register */
U8 u8RxIntCtl; /* Receive interrupt control register */
U8 u8BusColIntCtl; /* Bus collision interrupt control register */
} UART_REGISTER;
A lot of peripheral device registers are contiguous in memory, but some need padding in the structure so that each element maps correctly. This can be abstracted further, if the register set is all over the address space, via pointers to each register.
Then, assuming you know the lowest hardware register address of the device, you can declare a pointer to a structure at that address and access any register transparently via that pointer. It's an old idea, but it provides a bit of abstraction for the device itself and becomes a common access point for the device. For a set of UARTs, just declare an array of structures.
-
OTOH, writing functions that take a pointer to a peripheral struct rather than directly hard-coding the peripheral pointer in them can be a good idea for making them more versatile.
Also, you'll generally get better machine code if you pass a pointer to the peripheral struct rather than an index into an array of those peripherals.
I do not recall if I've ever used/needed an array of structures for peripherals, as I tend to pass and store the pointer to the particular peripheral structure instead. (If configured at build time, then I tend to use a preprocessor macro named according to the use case or purpose (say, LOGGING_UART or COMMAND_UART) that evaluates to the actual peripheral structure. Pointers I only use if it varies at run time, or more than one peripheral structure is used with the same functions.) I don't recall if I've ever needed a loop over a set of similar peripherals, except maybe during bootup initialization (where the efficiency difference is absolutely negligible anyway).
I just happen to know how I'd do it if I needed to, because I like to test and verify experimentally a lot of stuff that happens to pop into my mind, regardless of whether I have a use case in mind or not. You know, for research purposes.
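A minimal sketch of the two styles described above, build-time role macros versus a pointer passed at run time. It runs on a host, so uart0 and uart1 are ordinary variables here (on a target they would come from the vendor header or linker script), the uart_t layout is the toy one from earlier in the thread, and log_putc() and uart_putc() are hypothetical helper names.

```c
#include <stdint.h>

/* Toy register layout, as used in other examples in this thread. */
typedef struct {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
} uart_t;

/* ASSUMPTION: host-side stand-ins for the real peripheral structures. */
volatile uart_t uart0, uart1;

/* Build-time configuration: each role names the structure directly,
 * so the compiler can address it by symbol with no indirection. */
#define COMMAND_UART uart0
#define LOGGING_UART uart1

/* Role-specific helper; retargeting it only needs the macro changed. */
static inline void log_putc(uint8_t c)
{
    LOGGING_UART.tx = c;
}

/* Run-time variation: the caller chooses the peripheral via a pointer. */
static inline void uart_putc(volatile uart_t *u, uint8_t c)
{
    u->tx = c;
}
```

Moving logging to another UART is then a one-line change to the macro, with no array or index involved.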
-
"Also, you'll generally get better machine code if you pass a pointer to the peripheral struct rather than an index into an array of those peripherals."
Yes, typically passed in a register. It's often more efficient to use a structure and pointer for any function that has more than two or three parameters, and can be used to hold (private) variables common to several layers of functions.
-
You can see the difference in the following example:
#include <stdint.h>
typedef struct {
uint8_t rx;
uint8_t tx;
uint8_t status;
uint8_t padding[5];
} uart_t;
extern volatile uart_t uart0, uart1, uart2;
volatile uart_t *const uart[3] = { &uart0, &uart1, &uart2 };
#define uarts (sizeof uart / sizeof uart[0])
void uart_tx_all(uint8_t value) {
for (uint_fast8_t i = 0; i < uarts; i++)
uart[i]->tx = value;
}
With sufficient optimizations, uart_tx_all() compiles to three ldr+strb pairs on ARMv7e-m using Thumb, loading the structure address (directly using uart0, uart1, uart2 symbols; not through the uart array at all), storing value to offset 1 within each structure, plus a final bx lr.
Clearly, to write a value to three fixed memory locations it's enough to have one ldr (to load a base address) and three strb with various offsets (assuming it is all close together). And you would get that if you could use a pointer to the array of structs instead of the array of pointers to struct.
-
No, because those structures were externally defined in the example, so their addresses are not known while compiling said code.
I took mikeselectricstuff's "Even if the current part maps them at consecutive addresses, other, future larger parts might not, e.g. on a fancy new part with 10 UARTS they decided to put UART6 onwards at a different start address because they ran out of space after UART5" as a requirement for the solution I suggested.
However, if you do have consecutive structures in memory, then specifying those in the linker script as a specific section makes most sense:
volatile struct uart_registers uart[NUM_UARTS] __attribute__((section ("uart.registers")));
but, used via a function-like macro,
#define UART(i) (uart[i])
so that you can always switch to an array of const pointers:
volatile struct uart_registers *const uart_ptr[4] = { &uart0, &uart1, &uart2, &uart3 };
#define UART(i) (*uart_ptr[i])
This way, access via UART(i).tx, UART(i).status et cetera can always be optimized to best possible machine code, given the circumstances.
As an example,
extern volatile uart_t uart_struct[3]; // Defined elsewhere, for example in linker script
volatile uart_t *const uart_ref[3] = { uart_struct+0, uart_struct+1, uart_struct+2 };
#define UARTS (sizeof uart_ref / sizeof uart_ref[0])
#define UART(i) (*uart_ref[i])
void uart_txall(uint8_t value) {
for (uint_fast8_t i = 0; i < UARTS; i++)
UART(i).tx = value;
}
with GCC for ARMv7e-m using AAPCS32 (arm-none-eabi) will generate just one load and three stores, plus return from subroutine.
That is, even though the structures appear to be accessed via an array of pointers, because the array is const, the compiler can optimize the access to the initializer values, and omit the array dereferencing altogether. Then, if the accessed structures are consecutive in memory, only one base address is loaded, with stores relative to that. This would work this way even if the members were not in the same order, or if there were multiple sub-arrays in the pointer array! Win-win, all around.
The function-like macro notation can be a bit confusing. I've watched mike's videos, and I know he's capable and this wouldn't be a problem, but I didn't want to immediately jump to this. I was tempted, though!
The idea behind the UART(number) macro is to evaluate either to a structure in an array of volatile uart structures, or to a dereferenced pointer in an array of const pointers to volatile structures. In both cases the notation is the same, UART(index).member. That is, UART(number) will always evaluate to the structure, and is perfectly valid even on the left side of the assignment. Of course, it is obvious that it is a function or function-like macro, so if anyone else reads the code, there should be a comment explaining this.
If your source code spans multiple files, then the volatile uart_t *const uart_ref[3] = { uart_struct+0, uart_struct+1, uart_struct+2 }; initialization should be in a header file, in its entirety. It would NOT suffice for only the declaration (omitting the part after the =) to be in a header file, with the initialization in a C source file, as one would expect, exactly because the compiler has to see the initialization in each compilation unit to be able to fully optimize the UART(number) expressions through the const pointer array.
-
Let's say you have an architecture with two different type of UARTs, so you describe their register layout with struct uart_regs and struct usart_regs. Let's further say that their layouts are utterly incompatible, but they both have rx and tx fields, and you want to be able to use these in an array fashion.
In a header file, you declare the structure types, and then the UARTs/USARTs available. One common approach is to define their addresses in a linker script, and the number in a preprocessor macro:
#include <stdint.h>
struct uart_regs {
uint8_t rx;
uint8_t tx;
uint8_t status;
uint8_t padding[5];
};
struct usart_regs {
uint8_t status;
uint8_t padding[1];
uint8_t rx;
uint8_t tx;
};
#define UARTS 3
#define USARTS 2
extern volatile struct uart_regs uart_array[UARTS];
extern volatile struct usart_regs usart_array[USARTS];
In the same header file, declare the RX and TX arrays like this:
#define TXS (UARTS+USARTS)
static volatile uint8_t *const tx_ptr[TXS] = {
&(uart_array[0].tx),
&(uart_array[1].tx),
&(uart_array[2].tx),
&(usart_array[0].tx),
&(usart_array[1].tx),
};
#define TX(i) (*tx_ptr[i])
Again, the TX(number) macro expands to the tx structure member of the number'th uart or usart structure, and can be used on either side of an assignment (it is a valid "lvalue").
Note that because of the const, the compiler optimizes accesses through the pointer array and macro to direct accesses to the structures themselves, so that when you compile object files, those object files will not have a symbol named tx_ptr, nor any references to such.
So, while it may look like you get copies of that array in your object files, the const will save you from that, assuming you enable optimizations (-Os, -O1, -O2; -Og does not unroll the loop and will use the array, so with that you will get a copy of the array in each object file compiled with -Og that includes this header file).
Let's test the above:
void tx_all(uint8_t value) {
for (int i = 0; i < TXS; i++)
TX(i) = value;
}
Using GCC 5.4.1 through 14.2.0, -Os or -O2 -march=armv7e-m -mthumb (aapcs32, 32-bit arm-none-eabi), tx_all() compiles to
tx_all:
ldr r3, .L3
ldr r2, .L3+4
strb r0, [r3, #1]
strb r0, [r3, #9]
strb r0, [r3, #17]
strb r0, [r2, #3]
strb r0, [r2, #7]
bx lr
.L3:
.word uart_array
.word usart_array
which is basically as optimal as you can make this even if you wrote it by hand (except perhaps for the order of the above instructions), given that the address of uart_array and usart_array are not known yet at the time of this compilation.
Even if you reorder the elements in the tx_ptr array, the generated code stays optimal: only the store machine instruction order will change.
The purpose of the TX(number) macro is to simply hide the particular implementation detail, by doing the dereferencing for you. This allows you to modify the implementation, without having to modify any of the TX()-using code. We can discuss whether it is appropriate abstraction here, though. To me, it is just a tiny cost ("odd-looking" code, like assigning values to a function?) to save me from complicated search-and-replace process, if I decide to change the implementation behind it for any reason.
-
As an example,
extern volatile uart_t uart_struct[3]; // Defined elsewhere, for example in linker script
volatile uart_t *const uart_ref[3] = { uart_struct+0, uart_struct+1, uart_struct+2 };
#define UARTS (sizeof uart_ref / sizeof uart_ref[0])
#define UART(i) (*uart_ref[i])
void uart_txall(uint8_t value) {
for (uint_fast8_t i = 0; i < UARTS; i++)
UART(i).tx = value;
}
I would write it directly. Instead of:
extern volatile uart_t uart_struct[3]; // Defined elsewhere, for example in linker script
volatile uart_t *const uart_ref[3] = { uart_struct+0, uart_struct+1, uart_struct+2 };
#define UARTS (sizeof uart_ref / sizeof uart_ref[0])
#define UART(i) (*uart_ref[i])
I would do:
extern volatile uart_t uart_struct[3]; // Defined elsewhere, for example in linker script
#define UARTS (sizeof uart_struct / sizeof uart_struct[0])
#define UART(i) (uart_struct[i])
I don't see any need for the array of pointers here. If they're not consecutive, the uart_struct definition would change and macros would change accordingly.
In practice, I would probably have my own structs for each UART where the pointer to the hardware struct would be a member, plus my own information, states, buffers, counters, and whatnot. If I wanted an array, I would make an array of these structs, or a linked list, or a queue, or whatever I would see fit.
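A rough sketch of that per-UART driver structure, with the hardware pointer as one member next to software state. Everything here is illustrative: uart_port_t, port_queue() and port_service() are hypothetical names, the uart_t layout is the toy one from earlier in the thread, and sim_uart stands in for real peripherals so the code runs on a host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy register layout, as used in other examples in this thread. */
typedef struct {
    uint8_t rx;
    uint8_t tx;
    uint8_t status;
    uint8_t padding[5];
} uart_t;

#define TX_QUEUE_SIZE 16u

/* Software-side instance: hardware pointer plus driver state. */
typedef struct {
    volatile uart_t *hw;                /* peripheral this port drives */
    uint8_t  tx_queue[TX_QUEUE_SIZE];   /* ring buffer of pending bytes */
    uint8_t  head, tail;                /* ring buffer indices */
    uint32_t bytes_sent;                /* running statistics */
} uart_port_t;

/* ASSUMPTION: host-side stand-ins for two hardware UARTs. */
static volatile uart_t sim_uart[2];

uart_port_t ports[2] = {
    { .hw = &sim_uart[0] },
    { .hw = &sim_uart[1] },
};

/* Queues one byte; returns false if the ring buffer is full. */
bool port_queue(uart_port_t *p, uint8_t byte)
{
    uint8_t next = (uint8_t)((p->head + 1u) % TX_QUEUE_SIZE);
    if (next == p->tail)
        return false;
    p->tx_queue[p->head] = byte;
    p->head = next;
    return true;
}

/* Moves one queued byte to the hardware TX register, e.g. from a
 * TX-ready interrupt or a polling loop. Returns false if idle. */
bool port_service(uart_port_t *p)
{
    if (p->tail == p->head)
        return false;
    p->hw->tx = p->tx_queue[p->tail];
    p->tail = (uint8_t)((p->tail + 1u) % TX_QUEUE_SIZE);
    p->bytes_sent++;
    return true;
}
```

The array being indexed is now an array of driver instances, so the register layout and spacing of the underlying peripherals no longer matter to the calling code.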
-
I would write it directly.
Yep! Like I said, I am assuming the structures are not linear, or necessarily even all the same layout.
The macro UART(i) means that either way, the C code using UART(i) stays the same, and if compiler optimizations are enabled, you should get near-optimal machine code as a result.
In practice, I would probably have my own structs for each UART where the pointer to the hardware struct would be a member, plus my own information, states, buffers, counters, and whatnot.
For typical code, I as well.
If you can populate the array at initialization time, making such pointers volatile type *const name and initializing them at variable/object declaration time, in the unit where the functions manipulating the structure are, you can similarly avoid the indirect load and have the functions generate machine code accessing the pointed-to structure directly (using its symbol, not through this structure).
In other compilation units (where that initialization is not seen), the code will still load the pointer, once, but the compiler can cache it for any subsequent accesses.
If the UART may change at runtime, then the pointer member cannot be const-qualified anymore. (Type-punning the const-ness out, and replacing the pointer value, will lead to really odd bugs, so don't do that! It is because the unit where that const-ness and initialization value was seen at compile time will still use that initialization value, and the rest of the code the value from the structure... Sometimes being too clever really is a loaded foot-gun.)
I did think of an example where the array approach might be very useful: when creating a UART-interfacing USB stick, with one serial endpoint being a control endpoint where you can use terminal-like text commands to configure the UARTs and other endpoints.
-
If you can populate the array at initialization time, making such pointers volatile type *const name and initializing them at variable/object declaration time, in the unit where the functions manipulating the structure are, you can similarly avoid the indirect load and have the functions generate machine code accessing the pointed-to structure directly (using its symbol, not through this structure).
Usually, the reason you use pointers is because you want the same code to work with different entities, different UART modules in this case. In such case the pointer cannot be optimized out.
I did think of an example where the array approach might be very useful: when creating a UART-interfacing USB stick, with one serial endpoint being a control endpoint where you can use terminal-like text commands to configure the UARTs and other endpoints.
Very good example. For each converter, you would have a struct which contains pointers to endpoints, a pointer to UART, all the necessary settings, buffers, callback functions etc. The C functions would typically take that struct as a parameter. This way, the same code would work for all converters.
In OOP this would be called class instance.
-
You could put the UART registers in an array of structures. When you use C++ you have more options for syntactic sugar. Using C++ instead of C does not have a significant impact on code size as long as you keep to the right subset of C++ (no exceptions, no RTTI, and be very careful with, or avoid, templates). Several of my programs even became smaller when I switched to C++. The biggest disadvantage of C++ is that it's a (much) more complicated language. But I like the use of classes, which guides you into writing more readable and easier to maintain code. But these days I also do not do much programming anymore due to concentration problems.
With C++ you can write a class for a single UART, and then put the implementation-specific register names in the constructor. This is easy as long as all UARTs have the same layout. If specific bits get swapped or moved to other registers for some of the UARTs it becomes more difficult.
Also: There is a typo "Qustion for c experts" in the title of your topic.