You can trade an idle tone, or the spurs of multiple tones, for wideband noise, given a suitable coding (a random permutation of pulse patterns?), but you can't change the fact that the output is dithering back and forth and therefore must change by an average of one bit every N clocks for a commanded step change of 1/N.
The average case can be quite good, for example 50% is 1 bit every 2 clocks, which can run at Fclk/2, but the worst case will always have codes that don't fit conveniently into a given period, and thus a low frequency component.
On a related subject, you can get considerably more average-case resolution from a PWMDAC by changing the PERIOD register at the same time. This is easily calculated, say given the limitation that the PERIOD register should sit in the range PERIOD_MAX/2 to PERIOD_MAX. (It doesn't need to, it could go down to PERIOD = 2 if you don't mind, but this does increase the relative error due to switching edges and settling, and draws more current too.) One method is to treat the input number (say a 16 bit value) as fractional (i.e., taking the range 0 to 65535/65536), and calculate the convergents of that fraction.
Stop iterating when the ratio is exact, or when the denominator exceeds PERIOD_MAX. If the denominator is less than PERIOD_MAX/2, shift it left (double it, and the numerator along with it) until it no longer is.
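A minimal Python sketch of the procedure just described, for anyone who wants to play with it. The function name pwm_ratio and its parameters are made up for illustration, and I'm assuming "stop when the denominator exceeds PERIOD_MAX" means keeping the last convergent that still fits. It is not the original code, just one way to read the description.

```python
def pwm_ratio(value, bits=16, period_max=8000):
    """Return (duty, period) with duty/period close to value / 2**bits.

    Hypothetical helper; assumes 0 <= value < 2**bits.
    """
    num, den = value, 1 << bits
    # Standard continued-fraction recurrences:
    #   h_n = a_n * h_(n-1) + h_(n-2),  k_n = a_n * k_(n-1) + k_(n-2)
    h_prev, h = 0, 1
    k_prev, k = 1, 0
    while den != 0:
        a, rem = divmod(num, den)      # the one division per iteration
        h_next = a * h + h_prev
        k_next = a * k + k_prev
        if k_next > period_max:        # next denominator would not fit
            break
        h_prev, h = h, h_next
        k_prev, k = k, k_next
        num, den = den, rem            # den == 0 means the ratio is exact
    duty, period = h, k
    # Scale a short period up into the PERIOD_MAX/2 .. PERIOD_MAX range.
    while 2 * period < period_max:
        duty, period = 2 * duty, 2 * period
    return duty, period
```

For example, pwm_ratio(40000) returns (2500, 4096), which is exact because 40000/65536 reduces to 625/1024.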
I once calculated this, let me see... I think what I had was:
For fractions from 39000 to 43000 (/65536), PERIOD_MAX = 8000:
Max error: 0.32 LSB
Min error: 0, of course (there are 241 exact codes out of 4000 total, spaced in multiples of floor(65536/8000) or 8 apart)
RMS error: 0.0028 LSB (17.4 ENOB*)
Min iterations: 4 (iterations of the convergents algorithm)
Max iterations: 18
The restricted range may have been selected because ratios further from 0.5 have progressively worse performance (obviously enough -- you can't get a ratio of 1/65536 at all from a 13-bit counter!); or maybe out of laziness, because plotting sixty-five thousand goddamn points in Excel is a dumb idea.
*Effective number of bits, i.e., assuming the given setpoint is desired exactly. I've accounted for the narrow range spanned by these data (about 4 bits shy of a full range; the RMS resolution is closer to 21 bits absolute, but that would be unfair). Note this figure goes to infinity as PERIOD_MAX approaches 65536, because a 16-bit counter can perfectly reproduce every 16-bit fraction. The overall resolution is then limited by the fraction's quantization noise.
I suppose I should actually calculate it as the combined RMS of the convergent error and the input quantization error. But that wouldn't be very interesting, because the quantization dominates in this case. That in turn implies a smaller PERIOD_MAX could be selected, if greater worst-case error is tolerable.
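To put a rough number on "quantization dominates": a quick sketch, assuming the two errors are independent and combine as root-sum-square, and that the 16-bit input carries the usual uniform quantization error of 1/sqrt(12) LSB RMS. Both are assumptions on my part, and combined_rms is a made-up name.

```python
import math

def combined_rms(conv_rms, quant_rms=1 / math.sqrt(12)):
    """Root-sum-square of two independent RMS errors, both in LSB."""
    return math.hypot(conv_rms, quant_rms)

# 0.0028 LSB is the RMS convergent error quoted in the figures above.
total = combined_rms(0.0028)
```

The total comes out at about 0.2887 LSB, indistinguishable from the quantization term alone at four decimal places.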
If I'm not mistaken, one iteration of the algorithm requires: 1 div, 2 mul, 6 add. I forget offhand if the division can be removed; otherwise, this is quite reasonable on most MCUs for a real-time system.
Tim