On paper, NXP's LPC43xx (Cortex-M4 core) chips look like they could do it. There's "serial GPIO" (SGPIO: shift registers on steroids that supposedly can do any serial protocol), dedicated timer/state-machine hardware (the State Configurable Timer, SCT), and an event system that, IIRC, can trigger on just about anything you could want and then initiate DMA transfers between peripherals without going through the CPU core. Plus I2S and a ton of other interfaces, if that is any help.
But understanding the documentation well enough to actually use all this is another matter. Perhaps it has improved recently, but I couldn't even figure out where to begin. (Then again, I was just reading casually.)
Might be an option, or I might be completely off.