The main quality issue is not the bit depth but the filtering used to remove the ultrasonic remnants of the sampling clock. If the sample clock is 44.1kHz then that is only just over twice the audio range. That means your low-pass filter has to provide a substantial amount of high-cut in well under one octave: it must pass 20kHz, yet the first images of the audio band already appear at 44.1kHz - 20kHz = 24.1kHz, well before the clock frequency itself. Typically you need a high-order, multi-stage active filter to achieve this, and the filter design itself can make or break the sound quality. In particular, a poor filter design will introduce phase shifts within the passband.
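To put a rough number on how little room the filter has, here is a minimal sketch. The 20kHz band edge and the 96dB target (roughly the range of 16-bit audio) are my assumptions, not figures from any particular player, and the "slope" is only a crude average across the gap.

```python
import math

SAMPLE_RATE_HZ = 44_100   # CD sample rate
AUDIO_BAND_HZ = 20_000    # assumed top of the audio range
TARGET_ATTEN_DB = 96      # assumed target, roughly the 16-bit dynamic range

# The lowest image of the audio band sits at (sample rate - band edge).
first_image_hz = SAMPLE_RATE_HZ - AUDIO_BAND_HZ

# Width of the gap between the band edge and that image, in octaves.
transition_octaves = math.log2(first_image_hz / AUDIO_BAND_HZ)

# Crude average slope the filter would need across that gap.
required_slope_db_per_octave = TARGET_ATTEN_DB / transition_octaves

print(f"first image at {first_image_hz} Hz")
print(f"transition band: {transition_octaves:.2f} octaves")
print(f"average slope needed: {required_slope_db_per_octave:.0f} dB/octave")
```

The gap comes out at roughly a quarter of an octave, which is why a gentle one- or two-pole filter is nowhere near enough without oversampling.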
The solution is oversampling, which is basically a case of multiplying the sample clock (with the extra samples interpolated digitally) so the filtering requirement is shifted further away from the audio range. That means a simple low-pass filter can be used, which is far less likely to introduce artifacts into the audio range. Even some of the expensive early CD players lacked oversampling, and that is why the sound was 'fuzzy' and unclear. Some manufacturers then started making it a point of 'specmanship' to use silly amounts of oversampling. In fact this gains no further advantage. As long as the clock is high enough in frequency to avoid the need for a filter with an extremely sharp cutoff, that's all that's needed.
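As a rough illustration of how quickly the analog filter requirement relaxes, the sketch below estimates the Butterworth order needed at a few oversampling ratios. The Butterworth response, the 20kHz corner, the 96dB target and the function name `butterworth_order` are all assumptions chosen for illustration. Real players do much of the work in a digital filter first, so treat this as an order-of-magnitude guide only.

```python
import math

def butterworth_order(oversampling, base_rate_hz=44_100,
                      passband_hz=20_000, atten_db=96):
    """Smallest Butterworth order giving atten_db at the first image.

    Assumes the -3 dB corner is at passband_hz and that the first image
    starts at (oversampling * base_rate_hz - passband_hz).
    """
    stop_hz = oversampling * base_rate_hz - passband_hz
    ratio = stop_hz / passband_hz
    # Butterworth attenuation in dB at ratio r: 10*log10(1 + r**(2n)); solve for n.
    n = math.log10(10 ** (atten_db / 10) - 1) / (2 * math.log10(ratio))
    return math.ceil(n)

# Hypothetical set of oversampling ratios to compare.
orders = {k: butterworth_order(k) for k in (1, 2, 4, 8)}

for k, n in orders.items():
    print(f"{k}x oversampling: order {n}")
```

Under these assumptions the order falls from dozens of poles with no oversampling to a handful at 4x or 8x, and each further doubling buys less and less.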
Incidentally, a 44.1kHz sampling rate means that a 10kHz signal consists of only about four points per cycle*. A 1kHz signal gets about 44, and 100Hz bass about 441. Since 16-bit sample depth gives 65536 levels (or 32768 in each signal polarity), that is far more resolution than you can actually use when you only have 441 points per cycle. Hence 32-bit encoding is pointless UNLESS a much higher clock rate is also used.
* The human ear cannot tell the difference between sine, triangle and square waves above a few kHz, since the harmonics that distinguish them then start to fall beyond the range of hearing. So, this is actually quite acceptable. Believe it or not.
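The points-per-cycle figures are just the sample rate divided by the signal frequency, as this small sketch shows. The function name `points_per_cycle` is mine, and the last part relies on the fact that square and triangle waves contain only odd harmonics, so their first overtone is at three times the fundamental.

```python
SAMPLE_RATE_HZ = 44_100

def points_per_cycle(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples that fall within one cycle of a tone."""
    return sample_rate_hz / freq_hz

for freq_hz in (10_000, 1_000, 100):
    print(f"{freq_hz} Hz: {points_per_cycle(freq_hz):.1f} points per cycle")

# Amplitude levels available at 16 bits, in total and per polarity.
levels_16_bit = 2 ** 16
levels_per_polarity = levels_16_bit // 2
print(f"16-bit: {levels_16_bit} levels, {levels_per_polarity} per polarity")

# First overtone of a 10 kHz square or triangle wave (odd harmonics only).
first_overtone_hz = 3 * 10_000
print(f"first overtone of a 10 kHz square wave: {first_overtone_hz} Hz")
```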