Author Topic: Spectral Leakage in DFT / FFT (Read 6414 times)

hamster_nz · « **on:** October 24, 2016, 09:52:24 am »

I've been playing with FFTs to see if I can contribute something useful to The Cacophony Project ( https://cacophony.org.nz/ ), and just spent a night looking in the wrong place for the answer. I've been working on a high quality audio processing to allow raw recording to be down-sampled before processing (eventually from 48kHz down to 8kHz to feed into voice detection).

The first logical step is to low-pass filter to just below 1/2 the target sample rate, before it gets decimated. So to see how good my filter was or wasn't working I've also been running a DFT on the intermediate output, but was getting what looks like a whole lot of spurious high-frequency artifacts that shouldn't be there. So after trying to debug the filter it turns out that the issue was actually in the DFT!

If your data has any discontinuity between the start and end, that 'jump' splats 'spectral leakage' (https://en.wikipedia.org/wiki/Spectral_leakage) all across your output. Apparently one solution to this is to window the data, so the first and last samples are always zero, avoiding any jump.

Anyhow, this code isn't supposed to be an optimal FFT or DFT or anything, but pretty much shows what to do...

Code: [Select]

void fft_1024(int *sample, int *output) {
    int f, i;
    static double window[1024];

    /* Window the data before the DFT */
    for(i = 0; i < 1024; i++) {
       window[i] = (0.42 - 0.5*cos(2*PI*i/1023) + 0.08*cos(4*PI*i/1023)) * sample[i];
    }

    /* A poorly implemented DFT */
    for(f = 0; f < 512; f++) {
        double total_sin = 0.0, total_cos= 0.0;
        for(i = 0; i < 1024; i++)
        {
            double angle = (f*i%1024)/512.0*PI;
            total_sin += sin(angle)*window[i];
            angle = ((f*i+256)%1024)/512.0*PI;
            total_cos += sin(angle)*window[i];
        }

        total_sin /= 1024.0;
        total_cos /=1024.0;
        /* Scale to give approximately 0-255 values in the output for graphing */
        total_sin /= 4.0;
        total_cos /= 4.0;
        output[f] = sqrt(total_sin*total_sin+total_cos*total_cos);
    }
    return 0;
}

Hope it helps somebody someday...

chris_leyson · « **Reply #1 on:** October 24, 2016, 10:18:26 am »

I once worked or a company that had an FFT ASIC made for Doppler ultrasound and because they couldn't decide which window function to use they left it out

Needless to say the performance wasn't as good as it could have been, bunch of plonkers.

coppice · « **Reply #2 on:** October 24, 2016, 10:23:59 am »

Windowing is a fundamental aspect of most DSP. If you don't understand why you need to window, and the limitations of windowing, at the mathematical level, you don't really have the skills needed for any DSP work.

Why would you downsample to 8k samples/second when processing bird song? For many birds a lot of the energy is well above 4kHz. If you also want to process, say, insect sounds you often find that 100% of the energy is above 4kHz. What's wrong with processing the 48k samples/second signal? Its not a very high data rate for modern hardware to handle.

snarkysparky · « **Reply #3 on:** October 24, 2016, 01:06:53 pm »

If the data has discontinuities ( dropouts ) then this is really the original data multiplied by a magnitude one square wave which is zero where the dropouts occur.
The FFT of this is then the convolution of the original data frequency spectrum and the square wave frequency spectrum. The square wave spectrum can be limited at the high frequency points by rounding each transition gradually with a window. After this the convolution of the two spectrums will be shorter so to speak.

coppice · « **Reply #4 on:** October 24, 2016, 01:38:53 pm »

Quote from: snarkysparky on October 24, 2016, 01:06:53 pm

If the data has discontinuities ( dropouts ) then this is really the original data multiplied by a magnitude one square wave which is zero where the dropouts occur.
The FFT of this is then the convolution of the original data frequency spectrum and the square wave frequency spectrum. The square wave spectrum can be limited at the high frequency points by rounding each transition gradually with a window. After this the convolution of the two spectrums will be shorter so to speak.

You seem to be implying that a DFT or FFT gives you have something which is zero everywhere except where your chunk of non-zero samples are. A DFT or FFT is actually like taking a chunk out of the infinite Fourier sequence, and wrapping its ends together.

Kalvin · « **Reply #5 on:** October 24, 2016, 01:45:45 pm »

You may want to take a look at this freely available book "SPECTRAL AUDIO SIGNAL PROCESSING":
https://www.dsprelated.com/freebooks/sasp/

hamster_nz · « **Reply #6 on:** October 24, 2016, 05:35:44 pm »

Quote from: coppice on October 24, 2016, 10:23:59 am

Why would you downsample to 8k samples/second when processing bird song? For many birds a lot of the energy is well above 4kHz. If you also want to process, say, insect sounds you often find that 100% of the energy is above 4kHz. What's wrong with processing the 48k samples/second signal? Its not a very high data rate for modern hardware to handle.

One of the first things required is a way to prevent accidentally recording & archiving data that contains human voices (yeah, I know it is odd, but people might be walking though a forest and revealing their evil plans and their privacy should be respected), so I am downsampling to 8k to test feeding into a Voice Activity Detection filter like the ones in conference phones to allow me to mask out areas that are believed to contain intelligible speech. Most of these have been developed for about 4k audio bandwidth needed for telephony systems.

I'm also going to try to adapt some machine learning to see if I can flag sections that are atypical for the changing background noise field, and spectra of 48k audio feels overkill for that.

nctnico · « **Reply #7 on:** October 24, 2016, 05:50:43 pm »

I'd go a different route. IMHO FFT isn't very usefull in tone decoding / rythm detection algorithms. Tones may fall between FFT bins so the level gets lost and you are creating a lot of results you don't need. I'd start with band filtering and go from there. A very effective way to detect frequencies is a frequency counter. This gives you frequency/level detectors with infinitely sharp band filtering and low latency. Band filtering can be done with (elliptic) IIR filters which don't need much computational resources.

coppice · « **Reply #8 on:** October 24, 2016, 05:53:22 pm »

Quote from: hamster_nz on October 24, 2016, 05:35:44 pm

Quote from: coppice on October 24, 2016, 10:23:59 am
Why would you downsample to 8k samples/second when processing bird song? For many birds a lot of the energy is well above 4kHz. If you also want to process, say, insect sounds you often find that 100% of the energy is above 4kHz. What's wrong with processing the 48k samples/second signal? Its not a very high data rate for modern hardware to handle.

One of the first things required is a way to prevent accidentally recording & archiving data that contains human voices (yeah, I know it is odd, but people might be walking though a forest and revealing their evil plans and their privacy should be respected), so I am downsampling to 8k to test feeding into a Voice Activity Detection filter like the ones in conference phones to allow me to mask out areas that are believed to contain intelligible speech. Most of these have been developed for about 4k audio bandwidth needed for telephony systems.

I'm also going to try to adapt some machine learning to see if I can flag sections that are atypical for the changing background noise field, and spectra of 48k audio feels overkill for that.

Why not run the VAD at 48k? Most VAD algorithms are pretty iffy, even when run at sample rates higher than 8k.

A lot of birds emphasise frequencies above 4kHz - small heads generally create high pitches. I would be really concerned about cutting off everything above 4kHz.

ehughes · « **Reply #9 on:** October 25, 2016, 03:36:31 pm »

Quote

Apparently one solution to this is to window the data

To be technically correct, you are always "windowing" data even when you do nothing. Simply grabbing a block of data means that you effectively multiplying by a "rectangular" window. There is no such thing as a block of data with "no window"

Any window has its own DFT (Rectangular, Hanning, etc). Since all sampled blocks of data in time domain are effectively multiplied some some window, the spectral leakage you see is the *convolution* of the windows's DFT and the true signal's DFT in the Frequency domain.

In the case if the rectangular window, its spectrum is a sync function.

All windows have this effect. To see it, take the DFT of your window and you will get the profile. If you apply that window to say a sine wave, the DFT will result in the window being *shifted* to the bin of the frequency of your sine way. (convolution of a delta function [sine wave] in the frequency domain can mathematically be shown to be a shift).

For a complicated signal, you end up with a spectrum that is the convolution of the spectrum of the window and the spectrum of *real* signal.. The assumption is that the signal is periodic beyond the boundaries of the window.

Quote

I'd go a different route. IMHO FFT isn't very usefull in tone decoding / rythm detection algorithms. Tones may fall between FFT bins so the level gets lost and you are creating a lot of results you don't need. I'd start with band filtering and go from there. A very effective way to detect frequencies is a frequency counter. This gives you frequency/level detectors with infinitely sharp band filtering and low latency. Band filtering can be done with (elliptic) IIR filters which don't need much computational resources.

If single tones are what you want, then you need Goertzel's Algorithm:

https://en.wikipedia.org/wiki/Goertzel_algorithm

It is the most compact way to approach the problem. You can pull out Frequency and phase as needed. (It is essentially a complex valued IIR filter).

nctnico · « **Reply #10 on:** October 25, 2016, 03:49:39 pm »

The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.

ehughes · « **Reply #11 on:** October 25, 2016, 06:23:46 pm »

Quote

The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.

You have complete control over width of the bin. It is directly related to the block size of your computation. Small blocks == Wide Bins. If you need phase information than it is a good way do go. If not, then a filter is fine.

coppice · « **Reply #12 on:** October 27, 2016, 03:46:42 am »

Quote from: nctnico on October 25, 2016, 03:49:39 pm

The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.

A Goertzel transform is literally one bin of an Fourier transform, with all the good and bad which that implies. You have the same need to window to mitigate spectral leakage. You have the same ability to choose the number of bins to control their width.

hamster_nz · « **Reply #13 on:** October 27, 2016, 04:26:52 am »

Quote from: coppice on October 27, 2016, 03:46:42 am

Quote from: nctnico on October 25, 2016, 03:49:39 pm
The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.
A Goertzel transform is literally one bin of an Fourier transform, with all the good and bad which that implies. You have the same need to window to mitigate spectral leakage. You have the same ability to choose the number of bins to control their width.

The Goertzel algorithm seems to be much like a moving average FIR filter, but using complex values. That seems to me to make it incompatible with windowing, because your samples need to have different weights depending on when the samples were taken....

coppice · « **Reply #14 on:** October 27, 2016, 05:40:38 am »

Quote from: hamster_nz on October 27, 2016, 04:26:52 am

Quote from: coppice on October 27, 2016, 03:46:42 am
Quote from: nctnico on October 25, 2016, 03:49:39 pm
The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.
A Goertzel transform is literally one bin of an Fourier transform, with all the good and bad which that implies. You have the same need to window to mitigate spectral leakage. You have the same ability to choose the number of bins to control their width.

The Goertzel algorithm seems to be much like a moving average FIR filter, but using complex values. That seems to me to make it incompatible with windowing, because your samples need to have different weights depending on when the samples were taken....

Where did you get the idea that Goertzel is like a moving average FIR filter? An FIR process can be used progressively. Goertzel is an IIR process, and is a block processing algorithm.

You will find documents on the web about sliding Goertzel algorithms. These add a buffer to record the working data for the duration of the processing block. With this you can run the algorithm continuously, without the reset and restart operations you normally need to perform as you start the Goertzel algorithm. Basically, you record the effect of a sample going into the processing, and subtract its effect as the sample passes out of the processing window.

Goertzel really does calculate one bin of a Fourier transform. It is frequently used without windowing, where the system can to live with the resulting spectral mess. This is often possible, as you are only calculating one bin, and that bin may not be overly influenced by the leakage in a particular application. If you do need to deal with the spectral mess, you need to window.

hamster_nz · « **Reply #15 on:** October 27, 2016, 08:38:13 am »

Quote from: coppice on October 27, 2016, 05:40:38 am

Quote from: hamster_nz on October 27, 2016, 04:26:52 am

Quote from: coppice on October 27, 2016, 03:46:42 am
Quote from: nctnico on October 25, 2016, 03:49:39 pm
The only problem is that Goertzel's Algorithm often isn't suitable for the task at hand. And it won't work for detecting birds because Goertzel's Algorithm is too narrow. It basically is a single bin FFT.
A Goertzel transform is literally one bin of an Fourier transform, with all the good and bad which that implies. You have the same need to window to mitigate spectral leakage. You have the same ability to choose the number of bins to control their width.

The Goertzel algorithm seems to be much like a moving average FIR filter, but using complex values. That seems to me to make it incompatible with windowing, because your samples need to have different weights depending on when the samples were taken....
Where did you get the idea that Goertzel is like a moving average FIR filter? An FIR process can be used progressively. Goertzel is an IIR process, and is a block processing algorithm.

You will find documents on the web about sliding Goertzel algorithms. These add a buffer to record the working data for the duration of the processing block. With this you can run the algorithm continuously, without the reset and restart operations you normally need to perform as you start the Goertzel algorithm. Basically, you record the effect of a sample going into the processing, and subtract its effect as the sample passes out of the processing window.

If you substitute "moving average FIR" for "Goertzel" in the in the second paragraph, it reads perfectly true to me:

You will find documents on the web about moving average FIR algorithms. These add a buffer to record the working data for the duration of the processing block. With this you can run the algorithm continuously, without the reset and restart operations you normally need to perform as you start the moving average FIR algorithm. Basically, you record the effect of a sample going into the processing, and subtract its effect as the sample passes out of the processing window.

Of course the result is different (a high pass or low pass filter vs detecting a small band of frequencies), but with a sliding window Goertzel you can't efficiently window the data - you are pretty much forced to recalculate the whole block when you move slide window.

coppice · « **Reply #16 on:** October 27, 2016, 09:47:08 am »

Quote from: hamster_nz on October 27, 2016, 08:38:13 am

with a sliding window Goertzel you can't efficiently window the data - you are pretty much forced to recalculate the whole block when you move slide window.

Is there any sliding window algorithm that would support windowing?

hamster_nz · « **Reply #17 on:** October 27, 2016, 08:14:25 pm »

Quote from: coppice on October 27, 2016, 09:47:08 am

Quote from: hamster_nz on October 27, 2016, 08:38:13 am
with a sliding window Goertzel you can't efficiently window the data - you are pretty much forced to recalculate the whole block when you move slide window.
Is there any sliding window algorithm that would support windowing?

Triangular windowing should be possible , and could easily replace 'n' multiplications with appoximately 'n/2' additions and 'n/2' subractions.

A more well though out implementation might replace it with a handful of addtions and subtractions per sample.

https://en.wikipedia.org/wiki/Window_function#Triangular_window


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

EEVblog Electronics Community Forum

Author Topic: Spectral Leakage in DFT / FFT (Read 6414 times)

hamster_nz

Spectral Leakage in DFT / FFT

chris_leyson

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

snarkysparky

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

Kalvin

Re: Spectral Leakage in DFT / FFT

hamster_nz

Re: Spectral Leakage in DFT / FFT

nctnico

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

ehughes

Re: Spectral Leakage in DFT / FFT

nctnico

Re: Spectral Leakage in DFT / FFT

ehughes

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

hamster_nz

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

hamster_nz

Re: Spectral Leakage in DFT / FFT

coppice

Re: Spectral Leakage in DFT / FFT

hamster_nz

Re: Spectral Leakage in DFT / FFT

Share me