One approach used here is to: a) use only a generic subset of the device's capabilities, and b) divide the driver code into hardware-specific and generic function layers. Things like I/O ports, timers, UARTs and perhaps more can be abstracted that way.

Take the simplest case, I/O ports. Use a lookup table, perhaps an array of structures, containing the addresses of the various registers and their initialisation values (input or output direction, etc.), then force all access to go indirectly through the table via generic read/write drivers. This works well if, for example, the port register set is not contiguous in memory, as was the case for a Renesas CPU used on one project. Write a generic set of read and write functions (byte, bit, whatever is needed) to access the ports. Define a table for each port, indexed with an enum port ID, then perhaps another holding the setup for all the ports. To change the port, or even the CPU, just edit the table to suit.

Of course, there is a speed penalty from the indirection, but current embedded devices have more than enough throughput that it can be ignored most of the time. For the special case where it matters, you can always write a driver that accesses the port directly.

The same can be done for timers and UARTs, if the drivers are written right. It gets a bit more complicated where interrupts are involved, but the ISR code can be isolated in a separate CPU-specific module.

That approach has saved a load of effort here in the past and makes proven upper-layer code much more reusable. Partitioning and layering are key, and classic OS design can be a great example of how to do it right, even for the smallest of projects.
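The table-driven port access described above could be sketched roughly as follows. This is only an illustration, not the code from the project mentioned: every name here (`port_desc_t`, `port_table`, `port_read`, `port_write_bit`, and so on) is made up, the registers are assumed to be 8 bits wide, and the "registers" are simulated with a RAM array at deliberately non-contiguous offsets so the sketch can run on a PC. On a real target the table would hold the actual register addresses from the device datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: simulated register space so the sketch runs on a host.
 * On real hardware these would be fixed memory-mapped addresses. */
static volatile uint8_t sim_regs[8];

/* Generic port IDs used by the upper layers (hypothetical names). */
typedef enum { PORT_A, PORT_B, PORT_COUNT } port_id_t;

/* One table entry per port: where its registers live and how to set it up. */
typedef struct {
    volatile uint8_t *data;      /* data register                      */
    volatile uint8_t *dir;       /* direction register, 1 = output     */
    uint8_t           dir_init;  /* initial direction bits             */
    uint8_t           data_init; /* initial output value               */
} port_desc_t;

/* Hardware-specific layer: only this table changes for a new port or CPU.
 * The offsets are scattered on purpose to mimic non-contiguous registers. */
static const port_desc_t port_table[PORT_COUNT] = {
    [PORT_A] = { &sim_regs[0], &sim_regs[5], 0x0F, 0x00 },
    [PORT_B] = { &sim_regs[2], &sim_regs[7], 0xFF, 0xAA },
};

/* Generic layer: everything below goes through the table only. */

/* Apply the initial direction and data values to every port. */
void port_init_all(void)
{
    for (size_t i = 0; i < PORT_COUNT; i++) {
        *port_table[i].dir  = port_table[i].dir_init;
        *port_table[i].data = port_table[i].data_init;
    }
}

/* Read a whole byte from the port's data register. */
uint8_t port_read(port_id_t id)
{
    assert((unsigned)id < PORT_COUNT);
    return *port_table[id].data;
}

/* Write a whole byte to the port's data register. */
void port_write(port_id_t id, uint8_t value)
{
    assert((unsigned)id < PORT_COUNT);
    *port_table[id].data = value;
}

/* Read a single bit (0..7); returns 0 or 1. */
int port_read_bit(port_id_t id, unsigned bit)
{
    assert(bit < 8u);
    return (port_read(id) >> bit) & 1u;
}

/* Set or clear a single bit (0..7) with a read-modify-write. */
void port_write_bit(port_id_t id, unsigned bit, int level)
{
    assert(bit < 8u);
    uint8_t value = port_read(id);
    if (level) {
        value |= (uint8_t)(1u << bit);
    } else {
        value &= (uint8_t)~(1u << bit);
    }
    port_write(id, value);
}
```

Upper-layer code would then call `port_init_all()` once at startup and use only `port_read(PORT_A)`, `port_write_bit(PORT_B, 3, 1)` and similar, never a raw address. The read-modify-write in `port_write_bit` is not interrupt-safe as written; if an ISR touches the same port, it would need protecting in the CPU-specific module.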