Author Topic: Maximum slew rate typically found in music/voice  (Read 4533 times)


Offline HalFoster (Topic starter)

  • Regular Contributor
  • *
  • Posts: 207
  • Country: us
Maximum slew rate typically found in music/voice
« on: September 15, 2023, 05:57:40 pm »
I came across a topic yesterday that had me thinking about the slew rates present in audio and the max/min required for processing equipment - specifically, how to fill in the blanks in the following statement:

The highest slew rates are in the vicinity of <slew rate> and are frequently found in the waveforms generated by <instrument/voice/etc>.

My first guess (admittedly a WAG) would be the initial hard strike of a cymbal or something along those lines. Does anyone have any specific numbers and/or references on this topic?

Let me clarify that I am not referring to typical audio equipment's bandwidth - more like in an orchestra or concert in real life.

Thanks,

Hal
--- If it isn't broken... Fix it until it is ---
 

Offline tom66

  • Super Contributor
  • ***
  • Posts: 6601
  • Country: gb
  • Electronics Hobbyist & FPGA/Embedded Systems EE
Re: Maximum slew rate typically found in music/voice
« Reply #1 on: September 15, 2023, 06:40:51 pm »
Would it matter - if the human ear can't typically hear above 20kHz, the maximum slew rate is a function of frequency, since any energy above that frequency is going to be ignored by the low pass filter that is the human auditory system.  There will be some audiophiles who claim that sampling above ~44kHz is necessary for one reason or another but AFAIK there's no scientific basis for those claims.

I'd imagine it's possible that striking a cymbal could produce frequencies well above 20kHz and therefore slew rates well above what we would see from e.g. voice or wind instruments.
 

Offline HalFoster (Topic starter)

  • Regular Contributor
  • *
  • Posts: 207
  • Country: us
Re: Maximum slew rate typically found in music/voice
« Reply #2 on: September 15, 2023, 06:43:23 pm »
Well, no in real life it wouldn't matter at all - in spite of what Audiophools might think.  It was just curiosity on my part, nothing more.

Hal
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 19028
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Maximum slew rate typically found in music/voice
« Reply #3 on: September 15, 2023, 06:47:03 pm »
The question is ill-formed. The slew rate is a function of frequency and amplitude.

A 1kHz 100V signal has a slew rate 10 times higher than that of a 1kHz 10V signal.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: tom66, Smokey, T3sl4co1l, Karel, BrianHG, SiliconWizard

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7589
  • Country: ca
Re: Maximum slew rate typically found in music/voice
« Reply #4 on: September 15, 2023, 07:00:06 pm »
The question is ill-formed. The slew rate is a function of frequency and amplitude.

A 1kHz 100V signal has a slew rate 10 times higher than that of a 1kHz 10V signal.
I was getting to that....
What if you have a 1kv or 1 megavolt driver amp?  What happens to the slew rate?

 

Online TimFox

  • Super Contributor
  • ***
  • Posts: 7867
  • Country: us
  • Retired, now restoring antique test equipment
Re: Maximum slew rate typically found in music/voice
« Reply #5 on: September 15, 2023, 08:08:45 pm »
The reason why slew rate is important for audio is that when the waveform's slew rate approaches the maximum possible rate at any point in the circuit, the feedback around the amplifier can no longer correct for the distortion.
Also, for a typical "op amp" circuit, the slew rate is determined by the maximum output current of the input differential stage flowing into the compensation capacitor across the second stage
   dV/dt = Imax/C ,
and the input stage's nonlinearity increases drastically as its output current approaches the maximum value.
Similarly, feedback cannot correct for a deadband in the output stage ("crossover distortion").
Without appropriate filtering before the input, an unwanted glitch or transient could cause a temporary excessive slew rate, which can interfere with the good stuff you want to hear until the amplifier recovers.
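
A rough numeric sketch of the dV/dt = Imax/C relation above. The 30 µA and 30 pF figures are illustrative assumptions picked to land near 1 V/µs; they are not taken from any datasheet.

```python
def slew_rate_v_per_us(i_max_amps, c_comp_farads):
    """Slew rate limit dV/dt = Imax / C, returned in V/µs."""
    return i_max_amps / c_comp_farads / 1e6

# Assumed example values (hypothetical, for illustration only)
I_MAX = 30e-6    # 30 µA maximum output current of the input stage
C_COMP = 30e-12  # 30 pF compensation capacitor

print(f"{slew_rate_v_per_us(I_MAX, C_COMP):.2f} V/µs")
# Halving the capacitor doubles the slew rate limit
print(f"{slew_rate_v_per_us(I_MAX, C_COMP / 2):.2f} V/µs")
```

The point of the sketch is only the scaling: more input-stage current or less compensation capacitance raises the limit.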
 

Offline Benta

  • Super Contributor
  • ***
  • Posts: 5783
  • Country: de
Re: Maximum slew rate typically found in music/voice
« Reply #6 on: September 15, 2023, 08:50:46 pm »
The reason why slew rate is important for audio is that when the waveform's slew rate approaches the maximum possible rate at any point in the circuit, the feedback around the amplifier can no longer correct for the distortion.
Very well phrased. It was a big audiophool issue in the 80s/90s carrying the buzzword "TIM" or "transient intermodulation distortion". The simple and universal fix is an RC filter on the input of the circuit as mentioned.
 
The following users thanked this post: Karel

Offline Smokey

  • Super Contributor
  • ***
  • Posts: 2467
  • Country: us
  • Not An Expert
Re: Maximum slew rate typically found in music/voice
« Reply #7 on: September 15, 2023, 08:58:29 pm »
Musical instruments in the real world don't just create a single frequency.  There is usually a complicated set of harmonics involved.  The higher order harmonics can interact to generate lower audible frequencies sometimes.  But as has been mentioned already, you can't hear above 20kHz so you don't actually care about anything higher than that in the absolute sense.  If you were to digitize the resultant frequencies up to 20kHz, you could recreate what the real world thing "sounded like" to a human even if you don't have the full bandwidth representation of the original thing.

So from that perspective it doesn't matter what happens above about 20kHz, even though some of that stuff is actually generating the frequencies you can hear.

 

Offline spostma

  • Regular Contributor
  • *
  • Posts: 117
  • Country: nl
Re: Maximum slew rate typically found in music/voice
« Reply #8 on: September 15, 2023, 09:04:53 pm »
I would simply write a tool to analyze a number of typical professional audio recordings in WAV format.
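
A minimal sketch of such a tool, assuming 16-bit PCM WAV files and looking only at the first channel. The function names are made up for illustration. It reports the largest sample-to-sample step, both in LSB and in peak-full-scale units per second, so the result still has to be multiplied by whatever peak voltage the playback chain uses.

```python
import array
import sys
import wave

def max_step_per_sample(samples):
    """Largest absolute difference between consecutive samples, in LSB."""
    return max((abs(b - a) for a, b in zip(samples, samples[1:])), default=0)

def wav_max_slew(path_or_file):
    """Return (max step in LSB, max step in peak-full-scale units per second).

    Assumes 16-bit PCM; only the first channel is examined.
    """
    with wave.open(path_or_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        data = array.array("h")
        data.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV samples are stored little-endian
    first_channel = data[::channels]  # frames are interleaved by channel
    step = max_step_per_sample(first_channel)
    return step, step / 32768 * rate
```

Usage would be something like `wav_max_slew("track.wav")` (hypothetical file name). Note that the difference between adjacent samples can underestimate the slope of the reconstructed analog waveform for content near half the sample rate.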
 

Offline mzzj

  • Super Contributor
  • ***
  • Posts: 1236
  • Country: fi
 

Offline SeanB

  • Super Contributor
  • ***
  • Posts: 16249
  • Country: za
Re: Maximum slew rate typically found in music/voice
« Reply #10 on: September 15, 2023, 09:32:27 pm »
Pretty much most audio at some point or the other went through the venerable 741 opamp, or any of the clones, with the 1v/us slew rate it has. Works for low level audio, so pretty much should be a minimum, though better is not going to be easy to hear.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14048
  • Country: fr
Re: Maximum slew rate typically found in music/voice
« Reply #11 on: September 15, 2023, 09:39:01 pm »
The question is ill-formed. The slew rate is a function of frequency and amplitude.

A 1kHz 100V signal has a slew rate 10 times higher than that of a 1kHz 10V signal.

Obviously. To this I will add: a reasonable approximation for the minimum required slew rate would be to start with the classic relationship between bandwidth and rise time.
For a 10%-90% rise time Tr, and a bandwidth BW, it is approx.:

Tr = 0.35 / BW

If we take a typical "line" level for audio, amplitude in the order of 1V, so a p-p amplitude of 2V, we can approximate the 10%-90% section with 80% of the p-p amplitude, so 1.6V.
Let's call the p-p amplitude App.
The minimum required slew rate to accommodate a given bandwidth at the max amplitude given above would thus be (all values are in SI units, so App in V, BW in Hz and SR in V/s):

SR = 0.8 * App / Tr = 0.8 * App * BW / 0.35 = 2.29 * App * BW

With a typical 20kHz BW for audio and the typical amplitude mentioned above, that yields: SR = 2.29 * 2 * 20e3 = 91600 V/s = 0.0916 V/µs
At +/-10V you have 10 times that, so ~ 0.916V/µs.

Edit: Fixed a silly calculation mistake. We get a value of about 73% of the SR found classically from the derivative of a sine wave (a factor of 2.29 versus π ≈ 3.14 on App * BW). This is perfectly normal: the derivative method uses the maximum of the signal's slope (the max of the derivative of the sine wave), while the above rule of thumb is equivalent to using the slope of a line between the 10% and 90% points, considering sin(x) is relatively close to a straight line around the origin.
(Finding the equivalence between the two formally takes a little while in calculation, but it is, within a factor.)
As an extra exercise, you may consider what happens to a pure sine wave with the max frequency (= BW) if we limit the slew rate to the value calculated with the method above, rather than using the derivative at the origin.
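
A quick numeric comparison of the rise-time rule of thumb with the peak slope of a sine wave, as a sketch. The function names are illustrative, and the sine-wave version uses the standard 2πf·Vpk result with Vpk = App/2.

```python
import math

def sr_rise_time_rule(app_volts, bw_hz):
    """Slope of the 10%-90% segment using Tr = 0.35 / BW, in V/s."""
    rise_time = 0.35 / bw_hz
    return 0.8 * app_volts / rise_time

def sr_sine_peak(app_volts, bw_hz):
    """Peak slope of a sine with peak-to-peak amplitude App at f = BW, in V/s."""
    return math.pi * app_volts * bw_hz  # 2*pi*f*Vpk with Vpk = App/2

APP, BW = 2.0, 20e3  # 2 V p-p line level, 20 kHz bandwidth
rule = sr_rise_time_rule(APP, BW)
peak = sr_sine_peak(APP, BW)
print(f"rule of thumb: {rule / 1e6:.4f} V/µs")
print(f"sine peak    : {peak / 1e6:.4f} V/µs")
print(f"ratio        : {rule / peak:.3f}")
```

The ratio is independent of amplitude and bandwidth, since both expressions are proportional to App * BW.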

« Last Edit: September 16, 2023, 02:00:38 am by SiliconWizard »
 

Online wraper

  • Supporter
  • ****
  • Posts: 16681
  • Country: lv
Re: Maximum slew rate typically found in music/voice
« Reply #12 on: September 15, 2023, 09:48:54 pm »
Would it matter - if the human ear can't typically hear above 20kHz, the maximum slew rate is a function of frequency, since any energy above that frequency is going to be ignored by the low pass filter that is the human auditory system.  There will be some audiophiles who claim that sampling above ~44kHz is necessary for one reason or another but AFAIK there's no scientific basis for those claims.

I'd imagine it's possible that striking a cymbal could produce frequencies well above 20kHz and therefore slew rates well above what we would see from e.g. voice or wind instruments.
Sampling frequency above that is needed because the 2x sampling frequency from Nyquist's theorem does not really work IRL: real low pass filters are not perfect, so you want more margin. 44kHz sampling works well enough not because it works fine for 20kHz but because even though humans can hear high frequencies, they do it poorly. Nor does actual sound contain much in that part of the spectrum.
« Last Edit: September 15, 2023, 09:52:27 pm by wraper »
 

Online TimFox

  • Super Contributor
  • ***
  • Posts: 7867
  • Country: us
  • Retired, now restoring antique test equipment
Re: Maximum slew rate typically found in music/voice
« Reply #13 on: September 15, 2023, 10:10:58 pm »
Pretty much most audio at some point or the other went through the venerable 741 opamp, or any of the clones, with the 1v/us slew rate it has. Works for low level audio, so pretty much should be a minimum, though better is not going to be easy to hear.

The 1 V/us slew rate of the 741 should be adequate for signal levels below, say, 1 V rms.
Note that before you see the tell-tale straight line on the output waveform, the differential input stage has gone non-linear (large-signal behavior of differential pair).
The decompensated 5534 op amp (Ccomp = 0) has a slew rate of 13 V/us, decreasing to 6 V/us with 22 pF, according to the 1994 Philips datasheet.
(22 pF is needed for unity-gain stability, but 0 works for higher closed-loop gains.)
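
To put those slew rates in audio terms, here is a small sketch of the largest sine frequency that can be reproduced at a given peak level before the slew limit is reached, f = SR / (2π·Vpk). The two peak levels (1 V rms and 10 V peak) are just example choices.

```python
import math

def full_power_bandwidth_hz(slew_v_per_us, v_peak):
    """Highest sine frequency at amplitude v_peak that stays within the slew limit."""
    return slew_v_per_us * 1e6 / (2 * math.pi * v_peak)

parts = (
    ("741 (1 V/µs)", 1.0),
    ("5534, 22 pF (6 V/µs)", 6.0),
    ("5534, 0 pF (13 V/µs)", 13.0),
)
for label, slew in parts:
    for v_peak in (math.sqrt(2), 10.0):  # 1 V rms, and 10 V peak
        f_max = full_power_bandwidth_hz(slew, v_peak)
        print(f"{label:22s} Vpk = {v_peak:5.2f} V -> {f_max / 1e3:7.1f} kHz")
```

As noted above, the input stage goes non-linear well before the hard limit, so these figures are an upper bound rather than a design target.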
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 5971
  • Country: fi
    • My home page and email address
Re: Maximum slew rate typically found in music/voice
« Reply #14 on: September 15, 2023, 11:18:54 pm »
Slew rate must be at least \$2 \pi f V_{pk}\$ for a sinusoidal signal of frequency \$f\$ and amplitude \$V_{pk}\$ to be reproduced perfectly.  This is well documented for amplifiers.

This is because the voltage of the signal with respect to time is
$$V(t) = V_{pk} \sin \left( 2 \pi f t \right)$$
and the slew rate is the magnitude of its derivative with respect to time,
$$\left\lvert \frac{d V(t)}{d t} \right\rvert = \left\lvert 2 \pi f V_{pk} \cos\left( 2 \pi f t \right) \right\rvert$$
which reaches the extrema when \$\cos\left(2 \pi f t \right) = \pm 1\$.  Therefore, the maximum slew rate is \$2 \pi f V_{pk}\$.

For \$f = 20000 \text{ Hz}\$ and \$V_{pk} = 1 \text{ V}\$, the slew rate must be at least \$125663.7 \text{ V/s} = 0.126 \text{ V/µs}\$.
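
A small sketch that evaluates the formula and cross-checks it against a brute-force finite-difference slope over one period. Function names are illustrative only.

```python
import math

def sine_slew_v_per_s(freq_hz, v_peak):
    """Analytic maximum slope of Vpk*sin(2*pi*f*t), in V/s."""
    return 2 * math.pi * freq_hz * v_peak

def numeric_max_slope(freq_hz, v_peak, steps=100_000):
    """Finite-difference estimate of the maximum slope over one period, in V/s."""
    dt = 1.0 / freq_hz / steps
    volts = [v_peak * math.sin(2 * math.pi * freq_hz * i * dt)
             for i in range(steps + 1)]
    return max(abs(b - a) for a, b in zip(volts, volts[1:])) / dt

analytic = sine_slew_v_per_s(20_000, 1.0)
numeric = numeric_max_slope(20_000, 1.0)
print(f"analytic: {analytic / 1e6:.4f} V/µs")
print(f"numeric : {numeric / 1e6:.4f} V/µs")
```

Because the result is linear in both frequency and amplitude, scaling either one by ten scales the required slew rate by ten.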
« Last Edit: September 15, 2023, 11:25:49 pm by Nominal Animal »
 

Online gf

  • Super Contributor
  • ***
  • Posts: 1073
  • Country: de
Re: Maximum slew rate typically found in music/voice
« Reply #15 on: September 15, 2023, 11:56:47 pm »
...and the slew rate is the magnitude of its derivative with respect to time,

:-+ And I also don't see why the relation tr = 0.35 / BW would apply here, which is rather the 10%...90% risetime of the step response of a Gaussian filter with a given -3dB bandwidth. This waveform is not a sine wave, but the integral of a Gaussian bell.
 

Offline HalFoster (Topic starter)

  • Regular Contributor
  • *
  • Posts: 207
  • Country: us
Re: Maximum slew rate typically found in music/voice
« Reply #16 on: September 16, 2023, 12:21:03 am »
The question is ill-formed. The slew rate is a function of frequency and amplitude.

A 1kHz 100V signal has a slew rate 10 times higher than that of a 1kHz 10V signal.
I was getting to that....
What if you have a 1kv or 1 megavolt driver amp?  What happens to the slew rate?

That... is very true.  Instead of slew rate the term should have been rate of change or rather DV/DT.  I was thinking of how to get to a usable slew rate requirement and skipped a term.  Mea culpa.
 

Offline HalFoster (Topic starter)

  • Regular Contributor
  • *
  • Posts: 207
  • Country: us
Re: Maximum slew rate typically found in music/voice
« Reply #17 on: September 16, 2023, 12:29:29 am »
So a clearer question would be: In any average live musical or voice performance what would be the highest DV/DT of a natural waveform encountered and what instrument/voice would be the most likely to generate it.  Naturally ignoring any random interference patterns or such. 

I was looking for an answer like "A <some type of cymbal>  often generates a DV/DT of 200 V/µs on initially being struck." or somesuch.

I really wasn't looking to get too complicated with this, I just thought that someone, somewhere, would have done this research and might have published it somewhere. 

Hal
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 19028
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Maximum slew rate typically found in music/voice
« Reply #18 on: September 16, 2023, 12:53:26 am »
The question is ill-formed. The slew rate is a function of frequency and amplitude.

A 1kHz 100V signal has a slew rate 10 times higher than that of a 1kHz 10V signal.
I was getting to that....
What if you have a 1kv or 1 megavolt driver amp?  What happens to the slew rate?

That... is very true.  Instead of slew rate the term should have been rate of change or rather DV/DT.  I was thinking of how to get to a usable slew rate requirement and skipped a term.  Mea culpa.

Since slew rate is dV/dt, you haven't changed anything - the question is still ill-formed.

The necessary slew rate and dV/dt both depend on the waveform's amplitude.
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21503
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Maximum slew rate typically found in music/voice
« Reply #19 on: September 16, 2023, 03:44:52 am »
Cymbals emit audio waves, not voltage; the question isn't meaningful.  Through a transducer system (microphone and speakers), the voltage and therefore slew rate or dV/dt is proportional to the peak level, which is arbitrary: it might be 3V at line level, it might be down in the noise floor (<1µV?) if turned down/muted, or it might be thousands of volts on a powerful loudspeaker system (or some specialty types like electrostatic speakers).

The question you meant to ask, is frequency response, and that is easily answered: the ear is sensitive to up to 20kHz or so, typically, so that's all that we care about.  Sound sources can go higher, but there isn't much point in preserving that information for human consumption at least.

And since such an oscillation can be full-scale, the slew rate still depends on line level or whatever, but several to low 10s of V/µs is typical for such equipment.

Tim
« Last Edit: September 16, 2023, 03:47:02 am by T3sl4co1l »
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Online TimFox

  • Super Contributor
  • ***
  • Posts: 7867
  • Country: us
  • Retired, now restoring antique test equipment
Re: Maximum slew rate typically found in music/voice
« Reply #20 on: September 16, 2023, 02:42:05 pm »
A serious suggestion for the mathematically inclined to answer the original question:
1.  Assume the "music/voice" to be analyzed is on a conventional audio CD.
2.  The 16-bit digital waveform, sampled at 44.1 kHz (44100 samples per second), goes through a reconstruction filter (something like sinx/x) to give the audio output.
3.  Determine the code that gives you a maximum 1 kHz sine wave and define its reconstructed output as the reference level.
4.  Put the full-scale digital square wave (zero and max on alternating samples) into the same reconstruction filter and compute the slew rate at the output.
5.  Determine the code that gives you a 20 kHz square wave and do the same.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 5971
  • Country: fi
    • My home page and email address
Re: Maximum slew rate typically found in music/voice
« Reply #21 on: September 16, 2023, 06:05:51 pm »
Let's expand what TimFox described.

Conventional audio CDs contain metadata (including error correction), and uncompressed pulse-code modulated 16-bit stereo (two channel) data sampled at 44100 samples per second.  The maximum slew rate is therefore 2¹⁶ = 65536 quantization steps per one 44100th of a second, or 65536×44100 = 2890137600 steps/second ≃ 2.89 quantization steps per nanosecond.

If the audio CD data contains an alternating sample sequence (-32768, +32767, -32768, +32767, ...), per Nyquist-Shannon theorem, it should be reconstructed as a perfect 22050 Hz sine wave with maximum amplitude.  This leads to the \$2 \pi f V_{pk}\$ (here, \$138544 \, V_{pk}\$ per second, or \$0.1385 \, V_{pk}\$ per microsecond) minimum slew rate required for both the incoming circuitry before the ADC and the output amplifier circuitry after the DAC.

Does the DAC have more stringent slew rate requirements?  Slewing from rail to rail in a single sample period is \$2 \, V_{pk}\$ in one 44100th of a second, or \$88200 \, V_{pk}\$ per second, which is less than the aforementioned reconstruction limit.  This means that with a theoretically perfect brick-wall low-pass filter, the \$0.1385 \, V_{pk}\$ per microsecond slew rate suffices, but this slew rate is \$\pi/2 \approx 1.5708\$ times as fast as just slewing from rail to rail in a single sample period.  This affects our choice of DACs, as just being able to slew from rail to rail in a single sample period is not sufficient; it needs to slew basically 1.5708 times the rail-to-rail range in a single sample period.
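
The arithmetic in the last few paragraphs can be reproduced with a few lines, shown here as a sketch. All slew figures are in units of the peak voltage per second, so they are amplitude-independent.

```python
import math

FS = 44100  # CD sample rate, samples per second
BITS = 16   # CD sample width

# Full-scale jump in one sample period, in quantization steps per second
steps_per_second = 2**BITS * FS

# Peak slope of a full-scale sine at FS/2, in V_pk per second
nyquist_hz = FS / 2
reconstruction_slew = 2 * math.pi * nyquist_hz

# Straight rail-to-rail ramp (2 * V_pk) in one sample period, in V_pk per second
rail_to_rail_slew = 2 * FS

ratio = reconstruction_slew / rail_to_rail_slew

print(f"{steps_per_second} steps/s = {steps_per_second / 1e9:.2f} steps/ns")
print(f"reconstruction: {reconstruction_slew:.0f} V_pk/s")
print(f"rail-to-rail  : {rail_to_rail_slew} V_pk/s")
print(f"ratio         : {ratio:.4f}")
```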

If we did a Fourier analysis of the error spectrum at different (higher than necessary) DAC slew rates, we'd find a faster slew rate does push the error noise somewhat higher in the output spectrum, which makes it easier to filter out this particular error using analog circuits.  Note that we're still assuming a brick-wall low-pass filter at 22050 Hz for reconstructing the highest-frequency components in the audio signal, though.

For the ADC, there is no "slew rate" requirement as such, if we assume instantaneous sampling in time, or integration of the input signal for the duration of each sample.  Existing ADCs differ, but their frequency response is known, and can be compensated using filters prior to the ADC.  The main problem is that for perfect signal capture, we'd again need a brick wall low-pass filter at 22050 Hz, and AC coupling (i.e. rejecting the DC component).  In practice, human hearing does not go below 10 Hz - 20 Hz, so signals below 10 Hz can be rejected also, giving us a brick-wall 10 Hz to 22050 Hz band-pass input filter requirement.

Here we get into the realm that will forever weird out Audiophōles.  If we replace the exact DAC with an oversampled dithering DAC, we can push all quantization noise to higher frequencies, so that we don't need a brick wall filter; a more realistic low-pass filter can reconstruct our original signal perfectly.

Looking up delta-sigma modulation shows how these are done in practice, noting that the intermediate steps (within the modulation scheme) do require much higher clock rates than the sample rate used.  It is also useful to note that single-bit delta-sigma modulation is pulse-density modulation, exactly.
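
To make the pulse-density point concrete, here is a minimal first-order, single-bit delta-sigma loop. It is a textbook toy with names chosen for illustration, not a model of any particular DAC chip.

```python
def delta_sigma_1bit(samples):
    """First-order 1-bit delta-sigma modulator.

    Input samples are expected in the range -1.0 .. +1.0.
    Output is a list of +1.0 / -1.0 values whose local density tracks the input.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the error
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

bits = delta_sigma_1bit([0.5] * 10_000)
print(bits[:8])
print(sum(bits) / len(bits))  # average of the bit stream tracks the input level
```

Low-pass filtering (here, simple averaging) of the one-bit stream recovers the input level, which is why a gentle analog filter is enough after such a modulator running at a high clock rate.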

It is also useful to understand what kind of voltages are used to process and transfer audio signals.  The most common standard is line level, which is referenced to 0 dBV (1 V RMS, so \$V_{pk}\$ of 1.414 V) in "consumer" devices, and to 0 dBu (decibels unloaded, 0.775 V RMS, so \$V_{pk}\$ of 1.095 V) in "pro" devices, with signals clipped to somewhere between ±1.5V and ±2.0V.  Thus, an initial assumption of \$V_{pk} \approx 1.4 \text{ V}\$ for consumer line-level audio that does not clip, is sensible.
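
For reference, a small sketch converting dBu and dBV levels to peak voltage for a sine wave. It uses the usual definitions (0 dBu = 0.7746 V RMS, i.e. 1 mW into 600 Ω, and 0 dBV = 1 V RMS); the -10 dBV and +4 dBu rows are the common nominal consumer and pro operating levels, added here for comparison.

```python
import math

DBU_REF_VRMS = math.sqrt(0.6)  # 0.7746 V RMS: 1 mW into 600 ohms
DBV_REF_VRMS = 1.0             # 1 V RMS

def peak_volts(level_db, ref_vrms):
    """Peak voltage of a sine wave at the given level relative to ref_vrms."""
    return ref_vrms * 10 ** (level_db / 20) * math.sqrt(2)

for label, level, ref in (
    ("0 dBu", 0, DBU_REF_VRMS),
    ("0 dBV", 0, DBV_REF_VRMS),
    ("-10 dBV", -10, DBV_REF_VRMS),
    ("+4 dBu", 4, DBU_REF_VRMS),
):
    print(f"{label:8s} -> Vpk = {peak_volts(level, ref):.3f} V")
```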

Combining all of the above, and a rough estimate of \$f = 20 \text{ kHz}\$ for the highest frequency we humans care about, we can say that using line levels, the maximum slew rate needed is \$2 \pi f V_{pk} \approx 0.2 \text{ V/µs}\$.

To understand the range in slew rates we should consider, let's consider superhuman hearing that can detect components up to \$f = 25 \text{ kHz}\$, and a fully clipping signal with say \$V_{pk} = 2 \text{ V}\$.  The slew rate we get for this is \$\approx 0.31 \text{ V/µs}\$.

Because we are talking about stereo audio, however, we do need to consider the one oddity about human hearing: time discrimination.  Humans can detect audio signal time separation down to 10 µs, which corresponds to 100 kHz.  That is, because of the exact mechanism of human hearing (which is very much a spectrum analyzer, rather than time-domain sampling), humans can detect much smaller time delays than the period of the maximum frequency they can hear.  For engineers, the time-domain discrimination for changes in the spectrum detected in each ear is 10 µs.  In turn, this does mean that even though 20 Hz .. 20 kHz bandwidth per ear suffices, we may need much higher bandwidth to properly represent 3D audio effects, because of our extreme time-domain discrimination ability!

This also explains why 192 kHz audio sampling, even when band-filtered to say 10 Hz ... 20 kHz, can produce a superior stereo/3D audio experience.  Of course, that only really applies when the speaker configuration matches the microphone configuration, preferably a human head acoustic model with earlobes and all.

It also turns out that most 3D effects do not rely on high time-domain discrimination at all, but more on spectrum shaping; basically, our earlobes and the shape of our head cause sound spectra to be filtered differently based on their direction, with the time-domain separation being just "fine tuning" on top of that.  You can investigate and experiment on this further by looking into the open source OpenAL library.
« Last Edit: September 16, 2023, 06:08:31 pm by Nominal Animal »
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26565
  • Country: nl
    • NCT Developments
Re: Maximum slew rate typically found in music/voice
« Reply #22 on: September 16, 2023, 06:15:41 pm »
Would it matter - if the human ear can't typically hear above 20kHz, the maximum slew rate is a function of frequency, since any energy above that frequency is going to be ignored by the low pass filter that is the human auditory system.  There will be some audiophiles who claim that sampling above ~44kHz is necessary for one reason or another but AFAIK there's no scientific basis for those claims.
The problem is that -like with any filter- you'll see / hear distortions happen at much lower frequencies. Brickwall filtering at 20kHz gives nasty effects as well. So yes, for excellent audio quality you'll need to sample at much higher frequencies. Some of the higher end audio amplifiers have bandwidths up to 200kHz or more in order to have the lowest phase shift in the audio band. In the end a CD is pretty bad where it comes to frequency and dynamic range.

Because we are talking about stereo audio, however, we do need to consider the one oddity about human hearing: time discrimination.  Humans can detect audio signal time separation down to 10 µs, which corresponds to 100 kHz.  That is, because of the exact mechanism of human hearing (which is very much a spectrum analyzer, rather than time-domain sampling), humans can detect much smaller time delays than the period of the maximum frequency they can hear.  For engineers, the time-domain discrimination for changes in the spectrum detected in each ear is 10 µs.  In turn, this does mean that even though 20 Hz .. 20 kHz bandwidth per ear suffices, we may need much higher bandwidth to properly represent 3D audio effects, because of our extreme time-domain discrimination ability!
No! You don't need a higher samplerate to phase shift a signal by a small amount. In case of audio you need a higher samplerate /bandwidth to preserve the frequency/phase response better. There is a slight difference there.
« Last Edit: September 16, 2023, 06:55:05 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 
The following users thanked this post: newbrain

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 5971
  • Country: fi
    • My home page and email address
Re: Maximum slew rate typically found in music/voice
« Reply #23 on: September 16, 2023, 09:12:36 pm »
No! You don't need a higher samplerate to phase shift a signal by a small amount.
That statement makes no sense.

First, phase shift of a discretized signal is not exactly the same as delaying it, because the wave still starts at sampling window boundary, at best at a sample  boundary, in the reconstituted signal.  As it is the leading edge of a wave packet that is detected, the sinusoidal phases start at zero degrees, and shifting it to nonzero causes a leading error in the reconstituted signal, which can be audible as a high-frequency "click"-type noise.

That is, it is true that for a band- or low-pass limited signal, the higher time resolution does correspond to greater phase resolution, mathematically.  For a discretized signal, phase shift is problematic at the leading wavefront; and the leading wavefront is what is involved in the time discrimination here.

Second, phase resolution in a discrete signal is solely determined by the sample rate \$f_s\$ and sinusoidal component frequency \$f\$: \$360° \, f / f_s\$.  At the Nyquist-Shannon limit of half the sample rate, there are only two possible phases, 0° and 180°, because the discrete signal consists of alternating values.  At quarter the sample rate, there are four possible phases, and so on.  Phase resolution in a discrete signal is therefore always inversely proportional to frequency.

In case of audio you need a higher samplerate to preserve the frequency/phase response better.
That's not exactly it, though: for mono audio, none of this matters.
We only need a higher samplerate to preserve phase resolution of the difference of the signal to each ear at effective frequencies.

Consider a stereo audio signal
$$\begin{cases}
L(t) = 0, & t \lt 0 \text{ or } t \gt 1 \\
R(t) = 0, & t \lt \tau \text{ or } t \gt 1+\tau \\
L(t) = (1 - t)^2 \sin\bigr(2 \pi f t\bigr), & 0 \le t \le 1 \\
R(t) = (1 + \tau - t)^2 \sin\bigr(2 \pi f (t - \tau)\bigr), & \tau \le t \le 1+\tau \\
\end{cases}$$
where \$\tau\$ is the delay on arrival for the right channel, \$t\$ is time, and \$f\$ the fundamental frequency.  For such a signal with sharp leading edge or peak, it is \$\tau\$ that has the 10 µs resolution, even though our hearing is limited to approximately \$20 \text{ Hz } \le f \le 20000 \text{ Hz}\$ or so (as if we had 50 µs \$t\$ sampling intervals).

If we stored our audio as \$S(t) = \bigl(L(t)+R(t)\bigr)/2\$ and \$D(t) = \bigl(L(t) - R(t)\bigr)/2\$, only the latter (difference) would need the higher time resolution (sample rate, with both having the same bandwidth).  To reconstitute, we'd need to use the higher time resolution for both channels, and apply \$L(t) = S(t) + D(t)\$ and \$R(t) = S(t) - D(t)\$.

For a discrete audio signal, phase is a useful mathematical tool, though.  For example, the peak sensitivity for human hearing is a bit below 4000 Hz; let's approximate it as 3675 Hz, or 1/12th of the CD audio sample rate of 44100 samples per second.  This means that at that frequency, the phase resolution of CD audio is 360°/12 = 30°.  If we produce two sinusoidal signals at 3675 Hz (each full wave taking about 272 µs), one for each ear, both starting at zero phase, but one minutely delayed, humans can generally discriminate at down to 10 µs difference.  That corresponds to a 360° × 10µs / 272µs ≃ 13° phase difference at 3675 Hz.  Thus, CD stereo audio does not have sufficient phase resolution at 3675 Hz to match human hearing.
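
The numbers in the paragraph above can be reproduced as follows. This sketch takes the post's definition of phase resolution (one sample period expressed as a phase angle) at face value; the function names are illustrative.

```python
def phase_step_degrees(freq_hz, sample_rate_hz):
    """Phase advance of a tone at freq_hz over one sample period, in degrees."""
    return 360.0 * freq_hz / sample_rate_hz

def delay_as_phase_degrees(delay_s, freq_hz):
    """Phase angle that a time delay represents at freq_hz, in degrees."""
    return 360.0 * delay_s * freq_hz

FS = 44100.0
F = 3675.0  # FS / 12

print(f"period            : {1e6 / F:.1f} µs")
print(f"phase per sample  : {phase_step_degrees(F, FS):.1f}°")
print(f"10 µs as phase    : {delay_as_phase_degrees(10e-6, F):.2f}°")
print(f"phase step at FS/2: {phase_step_degrees(FS / 2, FS):.0f}°")
```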

Yet, phase does not convey the correct underlying concept, and may lead to problems –– for example, the case above, initial samples in a phase-shifted wave packet leading edge, that tripped nctnico too.

Humans are born with about 3500 hair cells in each ear, with each cell having a bundle of 50 to 100 hairs, stereocilia, which sense a specific frequency range.  We can think of each ear as a spectrum analyzer with about 3500 channels (with each channel consisting of a few dozen frequency samplers within its range).  Our brain can detect down to 10 µs delays between the activation of a pair of corresponding channels in each ear.  Therefore, instead of pure sinusoidal signals, it is better to use the concept of wave packets –– those with a steep leading edge or peak, for arrival time discrimination.  In some sense, the "spectrum analyzer" does 100,000 spectrums per second; but this isn't exactly correct either, as most changes are detected at much, much lower rate.

Overall, we're deep into human physiology and psychoacoustic model here.
« Last Edit: September 16, 2023, 09:18:44 pm by Nominal Animal »
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26565
  • Country: nl
    • NCT Developments
Re: Maximum slew rate typically found in music/voice
« Reply #24 on: September 16, 2023, 11:43:48 pm »
No! You don't need a higher samplerate to phase shift a signal by a small amount.
That statement makes no sense.

First, phase shift of a discretized signal is not exactly the same as delaying it, because the wave still starts at sampling window boundary, at best at a sample  boundary, in the reconstituted signal.  As it is the leading edge of a wave packet that is detected, the sinusoidal phases start at zero degrees, and shifting it to nonzero causes a leading error in the reconstituted signal, which can be audible as a high-frequency "click"-type noise.

That is, it is true that for a band- or low-pass limited signal, the higher time resolution does correspond to greater phase resolution, mathematically.  For a discretized signal, phase shift is problematic at the leading wavefront; and the leading wavefront is what is involved in the time discrimination here.

Second, phase resolution in a discrete signal is solely determined by the sample rate \$f_s\$ and sinusoidal component frequency \$f\$: \$360° \, f / f_s\$.  At the Nyquist-Shannon limit of half the sample rate, there are only two possible phases, 0° and 180°, because the discrete signal consists of alternating values.  At quarter the sample rate, there are four possible phases, and so on.  Phase resolution in a discrete signal is therefore always inversely proportional to frequency.
You are forgetting the actual sample values / resolution. You can sample a sine wave at a random phase and still end up with a sine wave. If your DAC resolution is infinite, then your phase resolution will also be infinite. And this holds true up to (not at) the Nyquist-shannon limit.
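
A small numeric sketch of this point, under idealized assumptions: a steady-state 3675 Hz tone, unquantized (floating point) samples at 44.1 kHz, and a 10 µs delay, which is less than half a sample period. The delay is recovered by correlating the samples with sine and cosine references over an integer number of periods. It says nothing about the leading-edge (onset) argument made earlier in the thread.

```python
import math

FS = 44100.0  # sample rate
F = 3675.0    # test tone, exactly FS / 12
TAU = 10e-6   # delay to encode; one sample period is about 22.7 µs
N = 120       # 10 full periods of the tone

def sample_delayed_sine(delay_s):
    """Samples of sin(2*pi*F*(t - delay)) taken at t = n / FS."""
    return [math.sin(2 * math.pi * F * (n / FS - delay_s)) for n in range(N)]

def estimate_delay(samples):
    """Recover the delay from the phase of the tone (single-bin correlation)."""
    w = 2 * math.pi * F / FS
    a = sum(x * math.cos(w * n) for n, x in enumerate(samples)) * 2 / N
    b = sum(x * math.sin(w * n) for n, x in enumerate(samples)) * 2 / N
    phase_lag = math.atan2(-a, b)  # a = -sin(phi), b = cos(phi)
    return phase_lag / (2 * math.pi * F)

estimated = estimate_delay(sample_delayed_sine(TAU))
print(f"estimated delay: {estimated * 1e6:.6f} µs")
```

With finite word length the recoverable phase is limited by quantization noise rather than by the sample period, which is the dependence on sample value resolution described above.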
« Last Edit: September 17, 2023, 12:06:37 am by nctnico »
 
The following users thanked this post: langwadt, newbrain

