Author Topic: FFT window function calculation detail  (Read 5332 times)


Offline rhb

  • Super Contributor
  • ***
  • Posts: 3518
  • Country: us
Re: FFT window function calculation detail
« Reply #25 on: December 10, 2022, 04:02:35 pm »
Compute the FFT of the "flat topped" triangle of length N, insert N zeros in the center and back transform.  Now you can see the half sample shift.

Padding a zero gives you an (N-1)-length operator stored in an N-length array.  It really doesn't matter what you do as long as you understand what you are doing in both domains.  If you use N in your calculations rather than N-1 you will get a subtly wrong answer.

@Nominal Animal

The DFT is defined over the semi-closed interval from 0 to 2*Pi.  It is sampled at increments of i/N for i=0 to N.  For N prime and  i=N, N/N = 1.  There are no integers between 0 and 1.  QED

I have an FFT code I wrote based on one that was circulating at UT Austin in my research group.  It is an arbitrary prime factor code in 2 pages of FORTRAN.  It will handle *any* length, but run time is N**2 for prime lengths.

It's a *very* good code and I strongly recommend it for general use if you don't want the footprint of FFTW. 

https://www.jjj.de/fft/fftpage.html

There's an interesting note from someone who used the algorithm in the 70's while working at IBM.  I've got the Ferguson paper, but not the Glassman one.

A bit of trivia from Jorge's FFT page:

I happened upon your FFT page and noticed that you have a generalized radix
FFT algorithm attributed to Glassman (listed as unknown).  I don't know if it
is of any interest to you, but I can cite a reference by him which describes
the algorithm.  It appeared in the IEEE Transactions on Computers in the
February, 1970 issue.  It was written by J. A. Glassman who is listed as
working at Hughes Aircraft in Canoga Park California.  The title of the paper
is "A Generalization of the Fast Fourier Transform" and describes the
calculation of a n-point FFT where n is not restricted to be a power of 2.
He provides a Fortran listing of the algorithm as well as the mathematical
development of it.

I used this algorithm and a modification of the code to implement a piece of
test equipment back in the early 1970's when I was working for IBM.  I have
never seen another paper by Glassman, but I recently came across a reference
to his generalized radix algorithm which provides an alternate (and perhaps
simpler) derivation.  Here is the reference:
 
W. E. Ferguson, Jr., "A simple derivation of Glassman general-n fast
Fourier transform," Comput. and Math. with Appls., vol. 8, no. 6, pp.
401-411, 1982.
Also, in Report AD-A083 811, NTIS, Dec. 1979.

Wayne Torrey
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3518
  • Country: us
Re: FFT window function calculation detail
« Reply #26 on: December 10, 2022, 05:49:56 pm »
I just realized an important issue may be getting overlooked.

You *must* pad the series to at *least* 2*N to prevent wraparound.  The other thing you *must* do is to remove any linear trend so that the first and last sample have the same value so there is no discontinuity.  You then *may* want to add back in the linear trend when you return to the original domain.  But if there is a discontinuity between the first and last sample it has the effect of a spike in time and creates a uniform bias to all samples in the frequency domain.  In all cases you must pay close attention to avoiding unidentified phase shifts.  They are lurking everywhere.
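In R, the recipe is only a few lines (a minimal sketch; it assumes a real series x already exists, and leaves the actual frequency-domain processing as a placeholder):
Code:
N <- length(x)
n <- seq_len(N) - 1
trend <- x[1] + (x[N] - x[1]) * n / (N - 1)      # line through first and last sample
y <- x - trend                                   # endpoints now match: no discontinuity
Y <- fft(c(y, rep(0, N)))                        # pad to 2*N to prevent wraparound
# ... operate on Y here ...
y2 <- Re(fft(Y, inverse = TRUE))[1:N] / (2 * N)  # back transform, keep first N samples
x2 <- y2 + trend                                 # add the linear trend back in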

If you are resampling by Fourier transform, the input and output series must span the same length of time.  Failure to ensure that will result in phase errors.  To make it work you vary the padding of the two series so that N1*dT1 = N2*dT2.  This is because of the semi-closed interval over which the DFT is defined.

In general, the speed of an FFT depends on how many factors N has.  Dave Hale's prime factor FFT uses benchmarked data to create the "wisdom" file.  It exhaustively tests the performance of all factorizations up to some limit, then uses those times to decide how much to pad the input.  It's in FFTW.  Long ago, before it was included, the FFTW folks benchmarked it against FFTW, and it was faster.  Not by much, but still faster.  N = 2**n is the most factors you can have at those sizes.

Edit: A few caveats about the DFT.  There are 3 scaling factors used, 1/N,N & 1/sqrt(N).  Moving data between software packages using different scaling can be a problem if you're not paying attention.  There are also 2 sign conventions for the exponent and 2 intervals commonly used, -Pi to Pi and 0 to 2*Pi.  The former is the mathematical definition, but the implementations are typically 0 to 2*Pi.  F(i*Pi-x) = F((i+2)*Pi-x) which is why the "negative" frequencies are at the end of the transform in descending order.  So there are 12 possible implementations of the same concept.  This is a common source of confusion and error.
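For example, base R's fft() uses the negative exponent and no scaling on the forward transform, so the 1/N must be applied by hand on the way back (a quick sanity check, base R only):
Code:
x <- rnorm(8)
X <- fft(x)                                       # forward: negative exponent, unscaled
max(Mod(fft(X, inverse = TRUE) / length(x) - x))  # ~1e-16 only because of the 1/N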

Hopefully this has taken a bit of the mystery out of the FFT.  The bottom line is that choosing N so that it is highly composite with lots of small factors provides considerable potential performance gains at very low cost.
Each factor of N is a multiply that can be replaced by an add.
« Last Edit: December 10, 2022, 07:38:50 pm by rhb »
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #27 on: December 11, 2022, 08:19:39 am »
The DFT is defined over the semi-closed interval from 0 to 2*Pi.  It is sampled at increments of i/N for i=0 to N.
For i=0 to N-1, inclusive, to be exact.

For N prime and  i=N, N/N = 1.  There are no integers between 0 and 1.  QED
That has nothing to do with sampling a waveform, and completely irrelevant here.  It's like saying "The sky is blue, therefore you are wrong."

What really matters, for prime N, is that \$e^{-2 \pi i / N}\$ is a primitive \$N\$'th complex root of unity.  Until you understand this, and why this is so, you really do not grasp Fourier transforms.

You surely understand the Cooley-Tukey prime-factorization-based fast discrete Fourier transform, but that is just a small part of all possible fast discrete Fourier transforms.  Please open your mind before you lead more people astray with your limited understanding and thus misleading advice.

I have an FFT code I wrote based on one that was circulating at UT Austin in my research group.  It is an arbitrary prime factor code in 2 pages of FORTRAN.  It will handle *any* length, but run time is N**2 for prime lengths.
Again, just because you think Cooley-Tukey is the only possible approach, does not make it so.

You already have FFTW installed and are familiar with it, so comparing the run time (after gathering wisdom) for say 1048573 (prime), 1048575 (3×5×5×11×31×41), 1048576 (2**20), 1048583 (prime) DFTs would tell you something important about your own understanding.

That you didn't bother doing even this shows that you are one of those who rely on their beliefs, are incapable of considering that they might be wrong, and therefore prefer to lead others astray instead of checking their own understanding.

Despicable behaviour, rhb.
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3518
  • Country: us
Re: FFT window function calculation detail
« Reply #28 on: December 11, 2022, 10:29:35 pm »
Please explain how the FFT works for prime length series.  Where is the computational savings?

As a reminder here's the definition from Brigham.

* dft.pdf (12.44 kB - downloaded 56 times.)

What you describe is the behavior of Dave Hale's FFT code.  It's likely not computing *any* of the specified series lengths, but a longer series which has been zero padded.  Hale is generally considered the best mathematical programmer among his peers from the Stanford Exploration Project.  The FFTW guys were quite stunned when Hale pretty consistently beat them on difficult lengths in a major benchmark they published.

I spent most of my career in the oil industry fixing errors in  complex numerical codes for which an FFT was just one of many operations and there were often many FFTs, as in millions of 3D FFTs in one code I worked on.  In that environment you *really* care about every nuance of the DFT and fast algorithms for computing it.

For a brief period my job was tuning a 3D FFT algorithm on  Intel i386  & i860 Hypercubes. 
 

Offline electr_peterTopic starter

  • Supporter
  • ****
  • Posts: 1443
  • Country: lt
Re: FFT window function calculation detail
« Reply #29 on: December 11, 2022, 11:01:48 pm »
... with \$a_0 = 0.5\$ being a Hann window, simplifying (via \$\cos(2 x) = 1 - 2 \sin^2(x)\$) to
\$w_n = \sin^2\left(\frac{\pi n}{N}\right), \quad 0 \le n \le N\$
Most window functions are defined as above, with \$w_0 = 0\$. Considering what was said above (w.r.t. window length \$L\$ and DFT length \$N\$), zeros are not counted as points inside the window. So with \$N\$ data points the maximum Hann window length is \$L = N-1\$, and we have one extra zero padded as well (my interpretation). As padded zeros essentially interpolate the FFT spectrum points with a sinc and do not add any extra information, this leaves me puzzled :-//
What is the point of having zeros in the window filter definition in the first place? Is this just a "quirk" of going from an integral over an interval to a finite sum? For big \$N\$ it converges to the same result; for small \$N\$ window filters probably should not have any zeros, to preserve all raw data points. Padding any amount of zeros (or none at all) is then up to practical considerations.

Compute the FFT of the "flat topped" triangle of length N, insert N zeros in the center and back transform.  Now you can see the half sample shift.
Not sure I completely understand the idea behind this. Can you show an example calculation?

A few caveats about the DFT.  There are 3 scaling factors used, 1/N,N & 1/sqrt(N).  Moving data between software packages using different scaling can be a problem if you're not paying attention.  There are also 2 sign conventions for the exponent and 2 intervals commonly used, -Pi to Pi and 0 to 2*Pi.  The former is the mathematical definition, but the implementations are typically 0 to 2*Pi.  F(i*Pi-x) = F((i+2)*Pi-x) which is why the "negative" frequencies are at the end of the transform in descending order.  So there are 12 possible implementations of the same concept.  This is a common source of confusion and error.
Good point, I crosscheck implementations to be sure. FFT(*) functions have implicit assumptions that should be checked before use.

Just for reference, fft() function help in R states
Quote
The FFT is fastest when the length of the series being transformed is highly composite (i.e., has many factors). If this is not the case, the transform may take a long time to compute and will use a large amount of memory.
Not sure which implementation it uses internally. A simple test (replicated many times to get a stable result) with 1024 (2^10), 1155 (3*5*7*11), and 1021 (prime) shows a ~15x longer calculation for the prime and ~1.7x longer for the composite vs the power of 2.
Code:
                      relative duration
fft(seq_len(1024))    1.000
fft(seq_len(1155))    1.683
fft(seq_len(1021))   14.663
« Last Edit: December 11, 2022, 11:03:51 pm by electr_peter »
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3518
  • Country: us
Re: FFT window function calculation detail
« Reply #30 on: December 12, 2022, 03:51:38 am »
1024 saves 9 multiplies per sample, 1155 saves 3 and prime does them all.

If it only takes 1.7x the 1024 time then the CPU and compiler are doing an excellent job of overlapping instructions and optimizing execution order.

If you're doing serious work with the FFT you should calculate the number of FPU operations for multiply-add and add-multiply order.

I'd also recommend getting a copy of Bracewell or at least the pictorial dictionary of transform pairs in the back.  With very modest effort and thought you can pretty much sketch the transform of anything on a napkin.  Most things can be approximated by the sum of some simple transforms.


NB The integer values for which the exponent is periodic are N/k, not k/N.  The latter has no integer values.  So my previous statement is not strictly correct.  Conceptually it is, but I stated the condition incorrectly.  I haven't done this stuff to any extent for over 10 years.  For serious work I have always looked up the equations, but the Internet doesn't get held to quite the same standard.

This is not something to slough off.  It's a lot of work, but the dividends are large.  You've probably done some of these things.  My point is a systematic series of experiments is essential to really understanding what is going on and why.
Some test cases you should always run with any new software: spike input and all ones.  You should be able to transform back and forth multiple times without significant errors developing.  Same thing for boxcar, Bartlett and Gaussian windows.  This will smoke out errors in indexing, normalization and a host of other pathologies.  Calculate the average noise level at each iteration as you transform from time to frequency and back many times.  Make sure you understand that a "spike" in a series is *never* a spike in fact.  It's a sinc(x) function.
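A sketch of that round-trip test in R (spike input shown; swap in rep(1, N) for the all-ones case, or any window shape):
Code:
N <- 1024
x <- c(1, rep(0, N - 1))                # spike; also try rep(1, N) and windows
y <- as.complex(x)
for (i in 1:100)
  y <- fft(fft(y), inverse = TRUE) / N  # one forward/inverse round trip
max(Mod(y - x))                         # should stay near machine epsilon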

Transform a one followed by 1023 zeros, add 1024 zeros in the middle of the transform array and transform back.  Look closely at the plot.  That is a rough approximation of reality.  For a better view use more zeros.
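In R the experiment looks something like this (a sketch; for simplicity it ignores the usual special handling of the Nyquist bin when splitting the spectrum):
Code:
N <- 1024
x <- c(1, rep(0, N - 1))                           # a "spike" at sample 0
X <- fft(x)
X2 <- c(X[1:(N/2)], rep(0+0i, N), X[(N/2 + 1):N])  # 1024 zeros in the middle
y <- Re(fft(X2, inverse = TRUE)) / N               # divide by the *original* N
plot(y[1:32], type = "b")                          # sinc ripple around the spike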
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #31 on: December 12, 2022, 09:25:06 am »
Please explain how the FFT works for prime length series.  Where is the computational savings?
First of all, you implied you're familiar with FFTW, but apparently haven't even read their invited IEEE paper from 2005.  If you had, you'd know FFTW implements three prime length algorithms.

Let me describe the theoretical background in a way that everyone can follow.

The mathematical basis for the discrete Fourier transform, equivalently the discrete time Fourier transform of a periodic signal with period \$N\$, is
$$X_k = \sum_{n=0}^{N-1} x_n e^{-2 \pi i \frac{n k}{N}}, \quad k = 0, \dots N-1$$
where \$x_n\$ are the samples in the time domain, and \$X_k\$ are the samples in the frequency domain.  The similarities to both Fourier series (sum) and the Fourier transform (integral) are intentional and not an accident, but to understand exactly why, we'd need to delve deeper into the math, including the sampling theorem.

Exploiting the fact that many of the coefficients \$\frac{n k}{N}\$ repeat has nothing to do with the fast discrete Fourier transform; it just means that such coefficients can be precalculated and stored efficiently, if one were to evaluate the sums naïvely.  (If you had say an FPGA or ASIC with many parallel multiply-add units, you could use the sum directly for a fixed \$N\$ DFT with minimal latency.)

If you consider the amount of computation for all \$X_k\$, you'll see that it involves \$(N-1)^2\$ multiplications, and somewhat fewer additions.  This gives it time complexity of \$O(N^2)\$.  Time complexity is not about real-world speed per se for any given \$N\$, it is about how the algorithm scales as \$N\$ increases.  If you implement the naïve algorithm, then you'll have on the order of \$N^2\$ multiplications.
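For concreteness, here is the naïve sum in a few lines of R (a sketch via the coefficient matrix, with base R's fft() as the reference):
Code:
naive_dft <- function(x) {
  N <- length(x)
  n <- 0:(N - 1)
  W <- exp(-2i * pi * outer(n, n) / N)  # the N x N coefficient matrix
  as.vector(W %*% as.complex(x))        # ~N^2 complex multiply-adds
}
x <- rnorm(128)
max(Mod(naive_dft(x) - fft(x)))         # the two agree to rounding error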

What we call Fast Fourier Transform is a class of algorithms that can reach time complexity \$O(N \log N)\$.

Cooley-Tukey FFT works for composite \$N\$, where \$N\$ is a product of small prime factors.  At the core, it allows you to calculate the DFT of \$N = N_1 N_2\$ samples by calculating \$N_1\$ DFTs of length \$N_2\$, and \$N_2\$ DFTs of length \$N_1\$.   When either one is composite, you can do that DFT recursively, giving you the \$O(N \log N)\$ time complexity.

There are two widely used algorithms for prime \$N\$, Rader's and Bluestein's FFT algorithms, and both reach \$O(N \log N)\$ time complexity.  Yes, they tend to be slower than Cooley-Tukey-based approaches, but they scale equally well: the ratio tends to stay constant as \$N\$ increases.  (There is also the prime-factor FFT (Good–Thomas algorithm), which works for \$N = N_1 N_2\$ where \$N_1\$ and \$N_2\$ are relatively prime (their greatest common divisor is 1); it is unrelated to Cooley–Tukey, but isn't relevant here for prime \$N\$.)

Both Rader's and Bluestein's FFT algorithms are based on convolution.  It is important to realize that this is a fundamental property of the Fourier transform, and not an approximation of some sort.

Rader's exploits the fact we can express the DFT of size \$N\$ using only a DFT of size \$N-1\$, via \$n = g^q \mod N\$ and \$k = g^{-p} \mod N\$ for an integer \$g\$ and \$q, p = 0 \dots N-2\$, forming a cyclic group, i.e.
$$\begin{aligned}
X_0 &= \sum_{n=0}^{N-1} x_n \\
X_{g^{-p}} &= x_0 + \sum_{q=0}^{N-2} x_{g^q} e^{-2 \pi i \frac{g^{q-p}}{N}}, \quad p = 0 \dots N-2 \\
\end{aligned}$$
Since \$N-1\$ is composite, the \$N-1\$ DFT can be efficiently calculated using Cooley-Tukey FFT, prime factor FFT, and so on.  In practice, however, because this is a cyclic group, we can instead pad the inner DFT with zeroes to a power of two length, which is trivially calculated (via Cooley-Tukey) in \$O(N \log N)\$ time.  Note that this is not the same as simply padding the original \$x_n\$ with zeroes to a power of two \$N\$, because that would affect the frequency binning, and therefore the results.  Here, by padding the inner DFT, we are exploiting the properties of the cyclic group generated by integer \$g\$, and the convolution theorem; and we get the exact same algebraic answers as we would if we did the Fourier transform naïvely.  This is crucial to understand, even if one is not well versed in algebraic group theory or the convolution theorem.

Bluestein's FFT algorithm is based on describing the DFT as a convolution of two sequences \$a_n\$ and \$b_n\$,
$$X_k = \overline{b}_k \left( \sum_{n=0}^{N-1} a_n b_{k-n} \right), \quad k = 0 \dots N-1$$
where \$b_n = e^{i \pi n^2 / N}\$ and \$a_n = x_n \overline{b}_n\$, and where the convolution can be calculated using zero-padded DFTs of a suitable power-of-two size \$M = 2^m \ge 2N - 1\$.  Again, padding these DFTs with zeroes is not the same as simply padding the original \$x_n\$; it is because of the convolution involved that we can obtain the exact desired transform using two DFTs of a larger power-of-two size.
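To make the convolution concrete, here is a minimal Bluestein sketch in R (illustrative, not tuned; nextn() picks the power-of-two size, and base R's fft() provides both the helper transforms and the reference):
Code:
bluestein_dft <- function(x) {
  N <- length(x)
  n <- 0:(N - 1)
  chirp <- exp(-1i * pi * n^2 / N)            # conj(b_n) = e^{-i pi n^2 / N}
  M <- nextn(2L * N - 1L, 2L)                 # power-of-two size >= 2N - 1
  a <- c(x * chirp, rep(0+0i, M - N))         # a_n = x_n conj(b_n), zero-padded
  b <- rep(0+0i, M)
  b[1:N] <- Conj(chirp)                       # b_m for m = 0 .. N-1
  b[M - (1:(N - 1)) + 1] <- Conj(chirp[2:N])  # wrapped around for m = -1 .. -(N-1)
  conv <- fft(fft(a) * fft(b), inverse = TRUE) / M  # circular convolution
  chirp * conv[1:N]                           # X_k = conj(b_k) * (a conv b)_k
}
x <- complex(real = rnorm(1021), imaginary = rnorm(1021))  # 1021 is prime
max(Mod(bluestein_dft(x) - fft(x)))           # agrees to rounding error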

What you describe is the behavior of Dave Hale's FFT code.
Dave Hale is extremely well versed in convolution theorem (see e.g. https://doi.org/10.1190/1.1827164) –– which is not surprising; these geology folks have dealt with FFT more than anyone else, and even invented AutoTune! –– and although I haven't seen his FFT code, I'd bet your favourite beverage that he used either Rader's or Bluestein's before FFTW folks included an implementation.

The convolution theorem is the key here.  While padding the sample sequence with zeroes to get an easier DFT size does change the frequency binning, you can circumvent that by using convolution and two (or more) DFTs of such extended sizes \$N\$.

I spent most of my career in the oil industry fixing errors in  complex numerical codes for which an FFT was just one of many operations and there were often many FFTs, as in millions of 3D FFTs in one code I worked on.  In that environment you *really* care about every nuance of the DFT and fast algorithms for computing it.
Many, many people use tools effectively without understanding them fully.  Correct understanding is not required for one to be effective.

I have used the Fourier transform, DFTs, DTFTs, and DCTs in multiple different fields myself, so I have had to learn several different facets of the transform.

I first encountered discrete cosine transforms with JPEG image compression and MPEG video compression and MP3 audio compression.  (They achieve compression primarily by quantizing the DCT coefficients, dropping the least important coefficients (depending on the coefficients themselves); MPEG only uses that for I-frames, with P- and B-frames also using preceding or preceding and following frames via motion vectors for small image blocks.)

I did some experimental work with low-energy X-ray image processing (when low-energy solid-state X-ray sensors became commercially viable in the '90s), and although I mostly did 2D kernel stuff, I did spend quite some time to understand 3D tomography, including the practical limitations associated with Fourier slice theorem.

My own field is computational molecular dynamics, specifically parallel and distributed simulation of large classical potential models ("force fields" in chemistry).  While calculus and linear algebra are much more common tools here, Fourier transforms do pop up here and there; say when you have successfully simulated a nanocluster or nanodot, and want to know its phonon spectra.  Ab initio (QM) simulations, however, are all about wave functions, and numerically calculating (using e.g. density functional theory) the wave function solutions to the Schrödinger equation.  A much deeper understanding of the different domains (spacetime and frequency) is needed, with the Fourier transform just the introduction to the field.

For a brief period my job was tuning a 3D FFT algorithm on  Intel i386  & i860 Hypercubes.
So?  I've met highly paid "MPI experts" who claimed that using asynchronous I/O is inherently dangerous, just because they couldn't wrap their mind around it, and could talk enough bullshit to silence anyone who tried to explain that doing I/O asynchronously would allow simulations to complete sooner, because that way communication and computation could be done simultaneously instead of sequentially.

Having a job to fine tune a tool, has really nothing to do with fundamental understanding of said tool.  I've seen "expert programmers employed by the best game developer companies" fine-tune a bit of code, insisting it is the best anyone can do, even when shown that a different algorithm can do it 10× faster, just because they couldn't grok the different algorithm.

I am by no means implying that you are bad at your job.  I'm sure you did good work, and were/are worth every cent.  But your contribution to this thread has had serious errors, especially when claiming that FFT does not work for prime sizes.  It does, as proven above; these algorithms scale just as well as Cooley-Tukey does.  Depending on the algorithm used, yes, the prime-size ones tend to be a bit slower –– by a constant factor –– than say power of two sizes, but they still qualify as "fast Fourier transform" because they scale equally well, \$O(N \log N)\$.  At least in the literature, this time complexity is the definition of FFT.

Current versions of FFTW use "wisdom" to determine how to most efficiently calculate an FFT (or other transform) of given size \$N\$.  It has several ways of doing so, with the slowest and most thorough simply brute-forcing the various methods, and picking the fastest one for the current machine.  Key is, they all produce the exact same results (except for inherent rounding errors, because we're using limited-precision floating-point number formats).  This alone should tell you (generic, plural) that Cooley-Tukey-based FFTs are not the only ones, even though they are the fastest ones currently known, and therefore somewhat preferable in high-performance computing.  In particular, radix-2 (powers of two) FFTs can be efficiently implemented on SIMD architectures like x86-64 (with SSE, AVX, etc.).

I hope this opens your eyes, or at least angers you enough so you'll check your understanding.  What I find despicable, is misleading others based on your own belief, even when shown proof that you could verify for yourself, but you won't, because it would hurt your ego.  Despicable.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #32 on: December 12, 2022, 09:45:26 am »
What is the point of having zeros in the window filter definition in the first place?
The point is to have zero zeros, or exactly one zero at one end of the sequence.  This lets you choose whether the peak (1.0) of your window function is an odd or an even number of samples wide.  This affects the frequency response of the filter, even though the effect goes to zero as \$N\$ goes to infinity.

I do believe I already wrote that whatever window function you use, you should examine the frequency response of your implementation, instead of relying on the theoretical response given a particular algebraic window function shape.  If you do this, you'll find out in practice how large the aforementioned effect is with any given window size.  Similarly, experimenting with the actual FFT window size (not window function size), between \$N\$ and \$2N\$, repeating the same contents for the larger one, should tell you a lot about how the generated spectrum depends on the FFT window size (\$N\$).

Theory is one thing, but we're using discrete math at limited precision, so real-world behaviour rules.

(What I am getting at, is that the FFT size defines the range and interval between frequency bins, and the window function the frequency response.  A slightly larger FFT is often faster to compute, but you still want a specific frequency response, so you may choose the different frequency binning but the same frequency response by padding your sample data with zeroes.)



The reason my suggested FFT sizes had two primes and two composites was exactly that "FFT" is not about the speed of the transform at a given size, but how the speed scales as the size increases.  The naïve DFT requires on the order of \$O(N^2)\$ operations, but anything that can do it in the order of \$O(N \log N)\$ operations is FFT.

For a better view, you'd compare say four different composite sizes to four different power of two sizes using FFTW (because it uses the aforementioned prime-based algorithms –– apparently, R only uses Cooley-Tukey), after brute-force gathering wisdom for the sizes, and see how they scale.

Say, 1024, 2048, 4096, and 8192; and roughly corresponding primes 1021, 2053, 4093, and 8191.  Because of cache effects and similar, there will be jumps, but the ratio between the power-of-two size and the prime size is limited; the ratio will reach a limit, and stay there.  This ratio is the relative speed between different algorithms, and is the reason why one might choose a power-of-two size instead of a more "natural" size.  However, any Fourier transform that scales as \$O(N \log N)\$ –– note, ignoring any scale factors! –– is a Fast Fourier Transform by definition.
« Last Edit: December 12, 2022, 09:47:46 am by Nominal Animal »
 

Offline gf

  • Super Contributor
  • ***
  • Posts: 1425
  • Country: de
Re: FFT window function calculation detail
« Reply #33 on: December 12, 2022, 01:26:13 pm »
I do believe I already wrote that whatever window function you use, you should examine the frequency response

And in order to do that, it is a good idea to pad the window function with a large number of zeros, in order to plot the frequency response at high resolution. Too coarse a sampling may give a wrong impression of the actual filter shape (for instance, if the frequency response of a Hann window happens to be sampled only at its peak and at its zeros, then you won't see the presence, height and shape of the side lobes).
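For instance, in R (a sketch: a 32-point Hann window zero-padded 64-fold to sample the response finely):
Code:
W <- 32
w <- sin(pi * (0:(W - 1)) / W)^2                      # Hann, single zero at n = 0
pad <- 64                                             # oversampling factor
Wf <- Mod(fft(c(w, rep(0, (pad - 1) * W)))) / sum(w)  # normalized response
f <- (seq_along(Wf) - 1) / (pad * W)                  # frequency, cycles/sample
plot(f[1:(8 * pad)], 20 * log10(Wf[1:(8 * pad)]), type = "l",
     ylim = c(-100, 0), xlab = "cycles/sample", ylab = "dB")  # lobes now visible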
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #34 on: December 12, 2022, 03:55:32 pm »
I do believe I already wrote that whatever window function you use, you should examine the frequency response

And in order to do that, it is a good idea to pad the window function with a large number of zeros, in order to plot the frequency response at high resolution. Too coarse a sampling may give a wrong impression of the actual filter shape (for instance, if the frequency response of a Hann window happens to be sampled only at its peak and at its zeros, then you won't see the presence, height and shape of the side lobes).
Exactly, to get high-resolution frequency bins.  Like I added in parens, the size of the DFT determines the range and interval of the frequency bins, and the window function its frequency response (including phase effects).  Because of how the frequency bins are distributed, the most useful DFT sizes are multiples of the actual window size you intend to use.

I guess I should amend my assertion, then: the zeros are about having zero zeros, or exactly one zero, at either end of the window to get the desired properties of the frequency response.  There are two completely separate things one may do with the zeros:
  • Give the desired frequency response via the window function having an odd or even number of nonzero coefficients (odd or even number of maximum coefficients)
  • Pad the DFT window to a size that is easy to compute and gives nice high-resolution frequency bins

Edited: Removed the continuous Fourier transform, because I found some errors I'll correct in a later post.
« Last Edit: December 13, 2022, 08:34:58 pm by Nominal Animal »
 

Offline electr_peterTopic starter

  • Supporter
  • ****
  • Posts: 1443
  • Country: lt
Re: FFT window function calculation detail
« Reply #35 on: December 12, 2022, 11:48:40 pm »
I get a much better picture of the effect of the weights on the FFT thanks to such insightful comments. Essential point - always check the actual implementation's frequency response, as it may not be what you expect. With such checks it is clear which triangle window implementation works better with odd/even \$N\$.
The (implicit) assumption of even \$N\$ has strong implications for the window weights, and most sources barely discuss the evenness topic, if at all. Theory and practice differ slightly; this was the root of my misunderstanding.

The formula above (lacks normalization by \$2 \pi i k\$) is good for a quick approximation, but it has some "wavy" amplitude error, especially near \$k = 0.5\$ - it shows lower amplitude (check the rectangle window, \$N = 21\$, vs the padded FFT). Compare this function with the padded FFT as well before use.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15799
  • Country: fr
Re: FFT window function calculation detail
« Reply #36 on: December 12, 2022, 11:56:53 pm »
Yep. And use a decent tool for verifying your assumptions. Can be Matlab, Scilab, Octave, Julia, whatever you're comfortable with.
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3518
  • Country: us
Re: FFT window function calculation detail
« Reply #37 on: December 13, 2022, 04:45:38 pm »
I suggest you read the Wikipedia page you linked to.  According to it, Rader computes the DFT by padding with zeros to a length that can be computed quickly.  Hale's innovation, if original to Dave, is in testing series lengths near prime lengths to find the quickest longer transform.  The FFTW "wisdom" file is machine and compiler specific.

"Geologists" rarely, if ever, use the FFT.  Geophysicists, however, are another matter.  The entire field of digital signal processing started at MIT in Norbert Wiener's Geophysical Analysis Group.  And geophysicists were doing FFTs *before* Cooley-Tukey appeared.  They never published because they thought the interchange of order of operations was obvious to anyone using a desk calculator and a pad of paper.

In seismic processing a "window" is the length of the non-zero portion of a time series.  Zeros on the end don't count.  You can easily verify this by comparing the transform of 99 ones followed by 925 zeros and 100 ones followed by 924 zeros.  A very common task in seismic processing is to estimate the variation of the signal spectrum over a series of windows in time.  Probably the most common window shape for that is a raised cosine edge boxcar.
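In R, for instance (a quick sketch; the 100-sample boxcar has a slightly narrower main lobe, with the first null near bin 1024/100 instead of 1024/99):
Code:
X99  <- Mod(fft(c(rep(1,  99), rep(0, 925))))  # window length 99, DFT length 1024
X100 <- Mod(fft(c(rep(1, 100), rep(0, 924))))  # window length 100, same DFT length
plot(X99[1:64], type = "l")
lines(X100[1:64], col = 2)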

If you want to define "window" differently you are free to do so.  I've worked in 7 seismic processing environments.  Each one has required about 6 months to learn the local jargon well enough to know who is bullshitting and who actually knows what they are doing.  I went into reflection seismology because there were no jobs for a hard rock petrologist.  So I switched from light to elastic waves and from hard rocks to soft rocks.

Reg
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #38 on: December 13, 2022, 10:34:18 pm »
In the hopes that I might post something useful that is not derailed by misleading advice from others, I'll sum up the important details here.

First, the discrete windowed Fourier transform is defined as
$$X_k = \sum_{n = 0}^{N - 1} w_n x_n e^{-2 i \pi k n / N} \tag{DFT}\label{DFT}$$
where \$N\$ is the size of the Fourier transform, \$w_n\$ are the window coefficients, \$x_n\$ are the (time-domain) samples, and \$X_k\$ are the frequency domain complex values, with \$\lvert X_k \rvert\$ being the amplitude, and \$\arg X_k\$ the phase.

(Note that if you consider \$w_n x_n\$ and \$X_k\$ \$N\$-dimensional vectors \$\mathbf{x}\$ and \$\mathbf{X}\$, then the above can be written in linear algebra terms as \$\mathbf{X} = \mathbf{F} \mathbf{x}\$, where \$\mathbf{F}\$ is an \$N \times N\$ square matrix with \$\mathbf{F}_{j k} = \omega^{j k}\$ (or \$\omega^{j k} / \sqrt{N}\$ in the unitary convention), where \$\omega = e^{-2 \pi i / N}\$ is the \$N\$'th complex root of unity, with \$\omega^n = \omega^{n \mod N}\$.  \$\mathbf{F}\$ is often called the DFT matrix.  One way to achieve a Fast Fourier transform is to exploit symmetries in this matrix.)

The frequency response of a discrete window function is obtained by setting \$x_n = 1\$, i.e.
$$X_k = \sum_{n = 0}^{N - 1} w_n e^{-2 i \pi k n / N}$$
This is what you want to use when evaluating the response of your discrete window, using the same machinery you use for the DFT calculation.

You can evaluate it at an arbitrary frequency \$f\$,
$$X(f) = \sum_{n \lvert w_n \ne 0} w_n e^{-2 i \pi f n}$$
such that \$X_k = X(k/N)\$; and similarly for the windowed Fourier transform,
$$X(f) = \sum_{n \lvert w_n \ne 0} w_n x_n e^{-2 i \pi f n} \tag{WFT}\label{WFT}$$

Discrete Fourier Transforms, including all Fast Fourier Transforms, on real data calculate \$\eqref{DFT}\$ for frequencies \$f = k/N\$ for \$k = 0 \dots \lfloor N/2 \rfloor\$.  \$\eqref{DFT}\$ and \$\eqref{WFT}\$ are exactly equivalent.

This shows two things
  • \$N\$, the size of the FFT, only affects the frequency binning, the frequencies \$f = k/N\$ where the discrete transform \$\eqref{DFT}\$ samples the continuous transform \$\eqref{WFT}\$.
  • Only samples at nonzero window coefficients \$w_n\$ contribute to the frequency response.  In other words, window coefficients \$w_n = 0\$ are useless, unless they are only used to pad the FFT to a desired larger size.
When we have a symmetric window, \$w_j = 0\$ for \$j \lt 0\$ and \$j \gt J\$, and \$w_j = w_{J - j}\$, there may be a difference in the frequency response \$X(f)\$ depending on whether \$J\$ is odd or even; i.e. whether the peak \$w_j = \max(w)\$ consists of an odd or an even number of coefficients.
This difference depends on the exact window function shape.  For many, like the Hann window, there is no real difference at all.
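You can check this yourself in R with the two triangle variants (a sketch; both responses are sampled on a finely padded grid and normalized to 0 dB at DC):
Code:
w_odd  <- c(0.25, 0.50, 0.75, 1.00, 0.75, 0.50, 0.25)        # single-sample peak
w_even <- c(0.25, 0.50, 0.75, 1.00, 1.00, 0.75, 0.50, 0.25)  # two-sample peak
resp <- function(w, M = 4096) Mod(fft(c(w, rep(0, M - length(w))))) / sum(w)
plot(20 * log10(resp(w_odd)[1:512]), type = "l", ylim = c(-60, 0), ylab = "dB")
lines(20 * log10(resp(w_even)[1:512]), col = 2)              # different side lobes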

To capture all information in a signal using windowed DFT, we need to overlap the windows, by generating a new DFT using samples \$x_{n+L}\$.  The exact amount of overlap (or the amount of "time" covered by a single DFT window) depends on the windowing function, not on the FFT size \$N\$.
For a rectangular window \$w_j = 1\$ for \$j = 0 \dots J-1\$, and \$w_j = 0\$ for \$j \lt 0\$ and \$j \ge J\$, no overlap is needed.
For many windowing functions, \$L < J\$.

As shown earlier, the zero coefficients \$w_j\$ do not contribute to the frequency spectrum at all.  Some windowing functions use \$w_0 = 0\$ because it is convenient, and define the width of the window by including this useless zero coefficient –– again, useless, because it does not contribute.  However, they often define the required overlap or \$L\$ in terms of their definition of the width of the window.  It is very common to define \$L = \lfloor J/2 \rfloor\$ (half the width of the window, including leading and/or trailing zeroes in the window function coefficients), with symmetry defined as \$w_j = w_{J-j}\$.

Even if you were to omit all leading and trailing zero window coefficients \$w_j\$, your window overlap \$L\$ does not change.
This means that in terms of the required overlap \$L\$, and the size of the FFT \$N\$, the most useful way to define the window function width \$W\$ is in terms of the distance between the first and last nonzero coefficients, i.e. \$W = j_{\max} - j_{\min} + 1\$, where \$w_{j_{\min}}\$ is the smallest-index nonzero coefficient, and \$w_{j_{\max}}\$ is the largest-index nonzero coefficient.  Then, \$W\$ is the number of samples needed per FFT, and \$L \le W \le N\$.  When \$N \gt W\$, the data is simply padded with zeroes.  If your sample rate is \$S\$ samples per second, you'll need \$W S / L\$ multiplications per second to window the incoming signal, and compute \$S/L\$ FFTs of size \$N\$ each second.
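As a worked example (all numbers hypothetical, purely to exercise the formulas above):
Code:
S <- 48000   # sample rate, samples per second
W <- 1023    # first to last nonzero window coefficient, samples per FFT input
L <- 512     # window advance per FFT (overlap step)
N <- 1024    # FFT size; the W windowed samples are padded with one zero
W * S / L    # ~95906 windowing multiplications per second
S / L        # 93.75 FFTs of size N per second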

The only issue with this definition of \$W\$ is that, the way window function overlaps are defined in the literature, you cannot rely on \$L = W/2\$, and must instead calculate \$L\$ based on whatever definitions are commonly used for that window function.

With regards to "Fast Fourier Transforms", the definition \$\eqref{DFT}\$ for \$k = 0 \dots N-1\$ involves \$(N-1)^2 \approx O(N^2)\$ multiplications.  Any method that computes the same in the order of \$O(N \log N)\$ operations qualifies as a "Fast Fourier Transform".  Note that the \$O\$ notation does not contain any coefficients, just the asymptotic behaviour: it describes not how fast an approach is, but how it scales when \$N\$ increases.

Fastest Fourier Transforms are based on power-of-two \$N\$, but there are almost as fast implementations for composite \$N\$ with small factors (like say \$N = 3888 = 2^4 3^5\$).  Rader's and Bluestein's Fast Fourier Transforms can handle prime \$N\$, but tend to be slower than the fastest algorithms for similar \$N\$ that are composite or powers of two, by some constant factor depending on the implementation details, but typically on the order of 2× to 10×.
Both also rely on convolution: Rader's calculates the exact DFT using the convolution of two DFTs of size \$N-1\$ (which is even, and hence composite, for any prime \$N \gt 3\$), and Bluestein's by using the convolution of two DFTs of size \$N_2 \ge 2 N + 1\$ where \$N_2\$ is composite with small prime factors, or a power of two.

Even though Bluestein's pads the two DFTs with zeroes, it still calculates the exact DFT of the original \$N\$, because it is a convolution of two DFTs of suitably defined subsequences.  In the worst case, \$N\$ being a Mersenne prime, Bluestein's does two DFTs of size \$2 N + 2\$, followed by \$N+1\$ complex multiplications.

It is crucial to understand that both Rader's and Bluestein's calculate the exact DFT for the given \$N\$, per the discrete Fourier transform formula, even though they use convolutions of discrete Fourier transforms at other sizes to do so.  There is no trick or approximation involved; they are mathematically exact.  So, when one needs a frequency spectrum with frequency bins at specific inverse prime intervals, \$N\$ prime, Rader's (if \$N-1\$ has only small prime factors) or Bluestein's (if \$2 N + 2\$ is a power of two, or when \$N-1\$ has large prime factors) will do that for you, scaling just like all FFTs do as \$N\$ increases, using at most a constant times the number of operations compared to if \$N\$ were a power of two (that constant depending on the implementation, typically on the order of 2 to 10).
« Last Edit: December 13, 2022, 11:40:25 pm by Nominal Animal »
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7194
  • Country: fi
    • My home page and email address
Re: FFT window function calculation detail
« Reply #39 on: December 14, 2022, 02:09:58 am »
As a proof of my advice, please take a look at my Window Function Spectrum Analysis page.  It is a self-contained HTML page (so you can download it and open it locally; no network connection needed).

For a practical example, put coefficients
    0.25 0.50 0.75 1.00 0.75 0.50 0.25 0.00
in the first (upper/red) text area, and coefficients
    0.25 0.50 0.75 1.00 1.00 0.75 0.50 0.25
in the second (lower/blue) text area, and click on the Calculate button.  This shows you the difference between odd and even-width triangle filters.

The page shows you the frequency spectrum you should expect if you take an FFT of those coefficients (real, imaginary format).  Feel free to compare the shown values to whatever FFT implementation you use, with the FFT window size the same as the number of coefficients input.

In the plot, vertical axis is normalized to the sum of the coefficients for each spectrum.   The vertical lines show you where the FFT samples the spectrum (if your FFT window size were the same as the number of coefficients).  If you add eight zeros to both text areas, instead of five vertical lines, you'll see nine.  Padding or prepending zeros to the coefficients only changes the FFT size (where the vertical lines are shown), and you see how that affects the sampling of the continuous spectrum.
The horizontal axis is linear in frequency, with DC at the left edge, and half the sample rate at the right edge.
 

