I am afraid that, as I mentioned earlier, these techniques do not seem to work when the application (a sine wave generator, for example) needs to generate frequencies over a relatively wide band, for example from 20 kHz down to 650 Hz. But perhaps I am wrong.
A lot of the PWM filtering techniques are designed around the goal of creating a DC voltage with very little or no ripple, and they do not help much if you want to produce a varying signal such as an approximated sine wave. That said, the "cancel-pwm-dac-ripple-with-analog-subtraction-revisited" technique promises to work quite well with a varying output signal too.
One method that always works, of course, is to use faster PWM, but you would need other hardware for that. Microcontrollers that run at a few hundred MHz are no longer exceptional, and some have extra high-resolution timers that can push the PWM frequency another few orders of magnitude higher.
If you want to stick with your AVR hardware, then raising its clock frequency to the maximum helps. If you go from 8 MHz to 16 MHz, you double the PWM frequency. With a simple RC filter that roughly halves the ripple if you keep the same cutoff frequency, or it lets you double the cutoff frequency while keeping the ripple about the same.
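A rough way to see how clock, cutoff and ripple trade off is to put numbers on it. The sketch below is only an estimate under assumptions that are mine, not from the thread: 8-bit fast PWM with no prescaler, a single first-order RC filter, 50% duty cycle (the worst case), and a cutoff well below the PWM frequency. The function names are made up for illustration.

```c
/* PWM frequency of an AVR timer in fast PWM mode:
 * f_pwm = f_cpu / (prescaler * (TOP + 1)). */
double fast_pwm_freq(double f_cpu, unsigned prescaler, unsigned top)
{
    return f_cpu / ((double)prescaler * ((double)top + 1.0));
}

/* Approximate peak-to-peak ripple of a first-order RC filter at 50% duty.
 * Assumes f_c << f_pwm, so the capacitor charges almost linearly:
 *   Vpp ~= Vcc / (4 * f_pwm * R * C) = Vcc * pi * f_c / (2 * f_pwm). */
double pwm_ripple_pp(double vcc, double f_pwm, double f_c)
{
    const double pi = 3.14159265358979323846;
    return vcc * pi * f_c / (2.0 * f_pwm);
}
```

With these assumptions, a 5 V supply and a 1 kHz cutoff, an 8 MHz clock gives a 31.25 kHz PWM and about 0.25 V of ripple. At 16 MHz the same filter gives half of that, and a 2 kHz filter gives the same ripple as before.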
If you want to use some pulse shaping, pulse density, or dither technique, then abusing one of the peripherals is the most logical choice, and an SPI port is the most logical one on an AVR. That way, you can let the peripheral shift out bits while the microcontroller is either fetching the next byte from a buffer or calculating the next set of 8 bits.
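As an illustration of "calculating the next set of 8 bits", here is a minimal sketch of one possible pulse density scheme: a first-order accumulator (delta-sigma style) that packs 8 output bits into a byte, MSB first. The names `pdm_state` and `pdm_next_byte` are hypothetical, and this is just one way to do it, not something from the thread. The code is plain C with no AVR registers, so it can be tried on a PC first.

```c
#include <stdint.h>

/* Hypothetical state: the running error accumulator, kept in 0..255. */
typedef struct {
    uint16_t acc;
} pdm_state;

/* Compute the next 8 bits of the bitstream for a sample level 0..255.
 * Each bit time the level is added to the accumulator; a carry past 255
 * emits a 1, so the long-run density of ones is level / 256. */
uint8_t pdm_next_byte(pdm_state *s, uint8_t level)
{
    uint8_t out = 0;
    for (int i = 0; i < 8; i++) {
        s->acc += level;
        out <<= 1;
        if (s->acc >= 256) {
            s->acc -= 256;
            out |= 1;
        }
    }
    return out;
}
```

A level of 128 gives the alternating pattern 0x55, and a level of 64 gives 0x11. On an ATmega the returned byte would be written to the SPI data register (SPDR) once the previous transfer has finished. Whether the hardware can shift bytes out back to back without a gap is something I have not checked, so treat that part as an assumption.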
But if you want performance: AVRs were quite nice microcontrollers 20 years ago, but there is so much better (faster) hardware around now that trying to squeeze the last few percent of performance out of an AVR is not very practical. You can do it just for fun, or as a brain teaser to see how far you can push it, but if you look at the performance-to-cost ratio, or at the hours it takes to get that extra bit of performance, then it's a waste of time.