Thanks all, the answers in this thread have thrown up a few things to think about (which was the point of the post, really). Probably the most interesting is using an STM32, as it could do other things as a slave MCU.
No no, not "slave MCU"; there are legitimate cases for a multi-MCU system, but I'm 99% sure this isn't one of them. Just do everything with one MCU/CPU and get rid of moving all that data around!
This is the tail end of an ongoing design which is using a Nordic nRF52832 BLE MCU.
"built around an Arm® Cortex™-M4 CPU with floating point unit running at 64 MHz"
While I can't guarantee this 100%, I think it should be able to software decode just fine. Have you tried?
Is there an option to upgrade to a more capable Nordic MCU, compatible with the existing code with minor changes?
https://www.helixcommunity.org/projects/datatype/Mp3dec is one option (just googled it); the performance numbers look like it should be able to do it. RAM looks like a somewhat tight resource with this particular implementation, though.
@Siwastaja - I was originally using a 12-bit DAC which worked OK, but I wasn't too happy with the audio, as the tail end of some sibilants was sounding a bit raspy, so I changed to 16-bit. I suppose with a 2nd MCU I'd have plenty of GPIO to play with, so I could easily implement an R-2R DAC.
No, an R-2R DAC is definitely much worse than an MCU-integrated 12-bit DAC. If the integrated 12-bitter isn't enough, use a proper audio DAC.
12-bit audio has been successfully used in recording and distribution; a classic C-cassette is roughly equivalent to 6-7 bits and is just fine for speech, just a bit noisy.
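To put rough numbers on the bit depths mentioned above, here is a minimal C sketch using the textbook rule of thumb for an ideal quantizer driven by a full-scale sine, SNR ≈ 6.02·N + 1.76 dB. The function names are made up for illustration, and real DACs (integrated or R-2R) will fall short of these figures because of noise and nonlinearity.

```c
#include <assert.h>
#include <stdio.h>

/* Theoretical SNR in dB of an ideal N-bit quantizer with a full-scale
 * sine input: 6.02 * N + 1.76. This is an upper bound, not a measurement. */
double ideal_snr_db(int bits)
{
    assert(bits > 0);
    return 6.02 * bits + 1.76;
}

/* Print the ideal figure for the bit depths discussed in the thread. */
void print_snr_table(void)
{
    static const int depths[] = { 6, 7, 12, 16 };
    for (unsigned i = 0; i < sizeof depths / sizeof depths[0]; i++) {
        printf("%2d bits: %.2f dB\n", depths[i], ideal_snr_db(depths[i]));
    }
}
```

By this formula 12 bits gives about 74 dB and 16 bits about 98 dB, while 6-7 bits lands around 38-44 dB, so 12 bits leaves a lot of headroom over what is described as fine for speech.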
I'm 99% sure the raspiness of the sibilants was caused by something other than a lack of bits.
Did you have any filtering after the DAC? DAC outputs can settle quite quickly, causing steps. A basic RC filter could do wonders.
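As a rough way to size such an RC filter, the sketch below computes the corner frequency of a single-pole RC low-pass, f_c = 1/(2πRC), and how much of the signal power survives at a given frequency. The function names and the example values (1 kΩ, 10 nF) are assumptions for illustration only; the thread does not state the sample rate or the DAC's output impedance, so the right values depend on the actual design.

```c
#include <assert.h>

static const double PI = 3.14159265358979323846;

/* Corner (-3 dB) frequency in Hz of a single-pole RC low-pass filter. */
double rc_cutoff_hz(double r_ohms, double c_farads)
{
    assert(r_ohms > 0.0 && c_farads > 0.0);
    return 1.0 / (2.0 * PI * r_ohms * c_farads);
}

/* Fraction of signal power passed at f_hz by a single-pole low-pass with
 * corner fc_hz: 1 / (1 + (f/fc)^2). A value of 0.5 corresponds to -3 dB. */
double rc_power_gain(double f_hz, double fc_hz)
{
    assert(fc_hz > 0.0);
    double ratio = f_hz / fc_hz;
    return 1.0 / (1.0 + ratio * ratio);
}
```

With the assumed 1 kΩ and 10 nF, `rc_cutoff_hz(1000.0, 10e-9)` comes out near 15.9 kHz. A single pole only rolls off at 20 dB per decade, so it softens the steps rather than removing them entirely.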
Unless the speech is meant to be listened to on audiophile gear, in a well sound-insulated room, or on high-quality headphones, for delivering "ear orgasms" of sorts, I bet an integrated 12-bit DAC is enough once you figure out what's degrading the result.