Author Topic: Making a sound camera  (Read 10014 times)


Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Making a sound camera
« on: April 30, 2018, 11:27:42 pm »
Hello,
Wouldn't it be nice to see sound, like this?
The common way to do that is with an FPGA computing an FFT for each microphone and calculating phase deltas between them.
Is it possible to do that without resorting to programming, maybe even using analog components?

In other words:
M-M-M
|   |   |
M-M-M
|   |   |
M-M-M
for each M, determine the phase shift, relative to its neighbor Ms, of the frequency with the highest amplitude
light up the LED on the other side of the PCB behind each M with brightness = (phase shift, if it's positive, meaning the sound reaches this mic ahead of its neighbors) * amplitude

EDIT: changed the title, since we're still figuring out how to make this thing.
« Last Edit: May 01, 2018, 11:04:31 pm by daslolo »
nine nine nein
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
The dumbest solution to this problem coming to my mind is:
If you have a sinewave with frequency w and a delayed copy of it with phase shift phi, you multiply them:
cos(w*t)*cos(w*t + phi) = 0.5*cos(2*w*t + phi) + 0.5*cos(phi)
then you low-pass filter the product to get rid of the high-frequency part, and the remaining DC term depends only on the phase difference of the sinewaves  :-+.
Now here comes a problem: for different frequencies the same phase shift means a different distance traveled: delta-phi = 2*pi*distance/wavelength, so you must have some way to separate the different frequencies. Really this thing is just asking for DSP and an FFT.  :)
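A minimal numeric sketch of that multiply-then-low-pass idea (plain C++; `detect_phase` is a hypothetical helper name, and averaging over whole periods stands in for the low-pass filter):

```cpp
#include <cmath>

// Sketch of the multiply-then-low-pass phase detector described above.
// Averaging the mixer output over whole periods acts as the low-pass
// filter: the 2*w term averages to zero, leaving 0.5*cos(phi).
double detect_phase(double w, double phi, double fs, int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        double t = i / fs;
        acc += std::cos(w * t) * std::cos(w * t + phi);  // analog mixer output
    }
    double dc = acc / n;         // "low-pass": mean of the product
    return std::acos(2.0 * dc);  // invert 0.5*cos(phi); the sign of phi is lost
}
```

For example, a 1 kHz tone sampled at 48 kHz for one second recovers phi = 0.5 rad to well within a milliradian. Note the acos only recovers |phi|: a single multiplier cannot tell a leading mic from a lagging one.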
« Last Edit: April 30, 2018, 11:55:13 pm by Sparker »
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
I like your solution, it sings.
I'd really like not to use chips that I have to program, so maybe have each microphone go through N barely overlapping filters before going to their own phase detectors.
M-F(n)-PHD(n)            \
   -F(n+1)-PHD(n+1)   - combine to form RGB - LED
   -F(n+2)-PHD(n+2)   /
    ....
nine nine nein
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
M-M-M
for each M determine the phase shift with the neighbor M of the frequency with the highest amplitude
The problem is that each M has its own highest-magnitude frequency. FFT is the only solution, though not necessarily an FPGA: a small MCU board like an STM32, SAM3X, or ATmega328 would be sufficient.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
M-M-M
for each M determine the phase shift with the neighbor M of the frequency with the highest amplitude
The problem that, each M has it's own highest magnitude frequency.  FFT is the only solution, not necessarily FPGA - small uCPU board like stm32 , sam3x or atmega328 would be sufficient
In my experience with the ESP32, ADC conversion takes time: 1 µs on the spec sheet, way more if accessed via the Arduino lib (1k samples take 10 ms!). And I think (I haven't figured out how to test this) that if I read one pin's ADC and then another, the second one will be sampled 1 µs after the first, so there will already be a phase shift caused by the sampling itself.
Now I may be wrong, and maybe there is a way to freeze all the ADC sample-and-hold buckets at the same time.
nine nine nein
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
You probably need N ADCs, or an ADC that can sample many channels simultaneously and then return the data serially. They must be clocked simultaneously, but once the conversion is over you can use whatever method to read the data out before the next conversion starts. Maybe their serial interfaces can be daisy-chained? Maybe they have parallel interfaces and you can connect them in parallel to a read port of your MCU and use the chip select of a specific ADC to read its data.  :-// So many possibilities and many kinds of ADCs.
If you only care about the 0...10 kHz frequency band, you can sample at about 20 kHz, which is 50 µs between samples. Probably even an AVR can handle that data rate.
Just store the data in a buffer, as many samples as you can, then do whatever you like with it. I'd offload it to matlab/octave and try to process it there first.
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
if I ADC one pin, then another one, the other one will be sampled 1uS after the first one, so there will be already phase shift caused by sampling.
Now I may be wrong and maybe there is a way to freeze all the ADC buckets at the same time.
You don't need the sampling to be synchronous. Sure, with one ADC there will be a phase offset, but it's a CONSTANT defined by the sampling rate. Just subtract it out of the phase difference.
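In code form, that subtraction might look like the sketch below (`true_phase` is a hypothetical helper name; it assumes the inter-channel skew dt is known, e.g. 1 µs for back-to-back conversions):

```cpp
#include <cmath>

// Sketch of subtracting the constant sampling skew, as described above.
// A multiplexed ADC samples channel 2 a fixed dt after channel 1, which
// adds a known phase offset of 2*pi*f*dt at frequency f.
double true_phase(double measured_phase, double f, double dt_skew) {
    const double PI = 3.14159265358979323846;
    double phi = measured_phase - 2.0 * PI * f * dt_skew;
    while (phi <= -PI) phi += 2.0 * PI;  // wrap back into (-pi, pi]
    while (phi >   PI) phi -= 2.0 * PI;
    return phi;
}
```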
 
The following users thanked this post: Sparker, werediver

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
if I ADC one pin, then another one, the other one will be sampled 1uS after the first one, so there will be already phase shift caused by sampling.
Now I may be wrong and maybe there is a way to freeze all the ADC buckets at the same time.
You don't need sampling to be  synchronous. Sure, with one adc there will be phase offset, but it's a CONSTANT defined by sampling rate.  Just subtract it out of the phase difference.
This dawned on me as I was driving to buy chocolate! :D
How do I evaluate the phase shift?
nine nine nein
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
Indeed, you don't need to sample synchronously. :palm: It just must be periodic sampling, obviously.
For the phase shift, check the "Shift in time" DFT property: https://en.wikipedia.org/wiki/Discrete-time_Fourier_transform
So at the DFT output the phases of the frequency bins will be shifted proportionally to the frequency of the bin and to the time shift.
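As a numeric sketch of that shift property (naive O(N²) DFT, hypothetical helper names): delaying a signal by m samples rotates bin k's phase by -2*pi*k*m/N, so comparing bin phases between two channels recovers the delay.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Naive DFT of one bin (fine for a demo; use an FFT library in practice).
std::complex<double> dft_bin(const std::vector<double>& x, int k) {
    const double PI = 3.14159265358979323846;
    const int N = (int)x.size();
    std::complex<double> acc(0.0, 0.0);
    for (int n = 0; n < N; ++n)
        acc += x[n] * std::polar(1.0, -2.0 * PI * k * n / N);
    return acc;
}

// Phase of x minus phase of y at bin k, wrapped into (-pi, pi].
// If y is x delayed by m samples, this returns +2*pi*k*m/N.
double bin_phase_delta(const std::vector<double>& x,
                       const std::vector<double>& y, int k) {
    const double PI = 3.14159265358979323846;
    double d = std::arg(dft_bin(x, k)) - std::arg(dft_bin(y, k));
    while (d <= -PI) d += 2.0 * PI;
    while (d >   PI) d -= 2.0 * PI;
    return d;
}
```

E.g. a bin-4 tone in a 64-point record, delayed by 2 samples, shows a phase delta of 2*pi*4*2/64 = pi/4.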
« Last Edit: May 01, 2018, 02:29:44 am by Sparker »
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Is the phase of an FFT the imaginary component?
Could using N microphones that output PWM instead of analog help lift some load off the uC?
nine nine nein
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
No, the phase is the angle between the bin's vector and the real axis, i.e. atan2(imaginary, real), like the phase of a complex amplitude or a complex impedance.

Quote
Could using N microphones that output PWM instead of analog help lift some load off the uC?
What do you mean?

Also I don't understand how the whole signal processing from NI works. Maybe someone could explain? :)
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Are you sure you are talking phase delta? That would be different for different frequencies....

Don't you just want delay, which will be constant for all frequencies? (The ~ 3 ms per meter of difference in distance travelled)

Or am I confused? ( I often am)


Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: The Soulman

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Are you sure you are talking phase delta? That would be different for different frequencies....

Don't you just want delay, which will be constant for all frequencies? (The ~ 3 ms per meter of difference in distance travelled)

Or am I confused? ( I often am)

Phase sounds cool but maybe you are right.
The end goal is to show a blob at the mic that gets hit by the sound first on the array.
nine nine nein
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Hum. First guess.

Look for a distinctive sound (e.g. a sudden loudness) on one center channel.

Window it (fade either end) so you get the pattern you are looking for. Make the window smaller than the FFT size (e.g. half the FFT size).

FFT it, so you get the complex spectrum signature you are looking for.

Take all the channels. FFT them to get a frequency spectrum for each channel.

Multiply each by the complex conjugate of the signature's spectrum (the conjugate is what makes this a correlation rather than a convolution).

Inverse FFT each of the resulting spectra. That will give you a time series of how well the signal matches the signature pattern. You should see a spike in each channel (and maybe echoes).

Identify the principal spike in each of the channels. Use the relative delays to triangulate the source.

(I used this sort of technique to find direct-sequence spread-spectrum signals once.)
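A time-domain sketch of the matching step in that recipe (`peak_lag` is a hypothetical helper name; brute force is practical only for small buffers, which is exactly why the FFT route is used): correlate each channel against the signature and take the lag of the biggest spike.

```cpp
#include <vector>

// Slide the signature along the channel and score the match at each lag;
// the lag of the peak is that channel's relative delay in samples.
// (The FFT recipe computes the same correlation, just much faster.)
int peak_lag(const std::vector<double>& channel,
             const std::vector<double>& signature) {
    int best_lag = 0;
    double best = -1e300;
    const int max_lag = (int)channel.size() - (int)signature.size();
    for (int lag = 0; lag <= max_lag; ++lag) {
        double score = 0.0;
        for (int i = 0; i < (int)signature.size(); ++i)
            score += channel[lag + i] * signature[i];  // correlation at this lag
        if (score > best) { best = score; best_lag = lag; }
    }
    return best_lag;
}
```

Comparing `peak_lag` across channels gives the relative delays used for triangulation.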

Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
I guess that since we want to measure time delays between channels, the cross-correlation function better fits this need.
Imagine you have two mics, both hit by the same signal with an unknown delay. You record the two signals and cross-correlate them. The peak of the cross-correlation output will be at the time offset between them  :-+.
Now add a second dimension orthogonal to the first, multiply the two cross-correlation outputs by each other (one horizontal, the other vertical), and you will have a 2D image with a peak rotated towards the sound source. Pass it through some 2D filter (like a Gaussian smoothing filter) and, I guess, you should get the same kind of result as the guys in the video.  ::)
Probably three mics will be sufficient for that: one in the middle, a second X centimeters above it, and a third X centimeters to the right of it.  :-/O
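A sketch of the two-mic case described above (`tdoa_samples` is a hypothetical helper name; brute force over lags): the lag that maximizes the cross-correlation is the sample delay of one mic relative to the other.

```cpp
#include <vector>

// Brute-force cross-correlation over lags in [-max_lag, +max_lag].
// Returns the lag (in samples) at which b best matches a; if b is a
// delayed by d samples, the peak lands at +d.
int tdoa_samples(const std::vector<double>& a,
                 const std::vector<double>& b, int max_lag) {
    int best_lag = 0;
    double best = -1e300;
    for (int lag = -max_lag; lag <= max_lag; ++lag) {
        double acc = 0.0;
        for (int i = 0; i < (int)a.size(); ++i) {
            int j = i + lag;
            if (j >= 0 && j < (int)b.size())
                acc += a[i] * b[j];  // overlap-limited correlation
        }
        if (acc > best) { best = acc; best_lag = lag; }
    }
    return best_lag;
}
```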
« Last Edit: May 01, 2018, 12:59:55 pm by Sparker »
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
@Sparker and @hamster_nz you're both describing the same thing in your own words. I have no idea how to do that at the moment, but it looks like the FFT is the key anyway. The good thing is I don't need much visual resolution in terms of displaying the frequencies, so a 128-bin FFT should be enough. And that is very fast, about 3 ms on an ESP32.

And earlier I was thinking of using I2S MEMS mics to bypass the ADC conversion time... it seemed like a good idea, but how long will transferring the I2S stream take? I think it'll take longer than an instantaneous voltage read, which equates to moving the time delay from conversion to data transfer. Maybe someone can confirm?
nine nine nein
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Don't know about the ESP32, but as with any digital bus, I2C transfer time is defined by clock rate * number of bits. The I2C standard AFAIK supports 100k, 400k, and 3.4M clocks. See what clock your ESP32 driver uses. For myself, I usually measure Arduino speed performance using any spare digital pin and a scope, setting it high/low at the start and end of the time-critical function.
millis() does the same thing, in software.
Regarding the size of the FFT, it depends on the lower end of the frequency range and the bandwidth. For example, 10 k sampling and 128 bins gives just 10000/128 = 78.125 Hz frequency resolution. That means you can't measure below this value, and, more importantly, the relative error is 100% at 78 Hz and 10% or so at 780 Hz (not sure it's linear, but you get the idea; do the math to verify), which may not be acceptable.

Cross-correlation computed directly is equivalent to the DFT. It has very little or zero practical value, since the FFT is about a thousand times faster.
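The bin-width arithmetic above, as a sketch (hypothetical helper names; worst-case error taken as one full bin, matching the 100%-at-78-Hz figure):

```cpp
#include <cmath>

// Frequency resolution of an N-point FFT at sample rate fs, and the
// worst-case relative error when reading a tone's frequency straight off
// a bin index (one bin of uncertainty, as in the 10 kHz / 128-bin example).
double bin_width_hz(double fs, int n_bins) {
    return fs / n_bins;
}
double worst_rel_error(double f, double fs, int n_bins) {
    return bin_width_hz(fs, n_bins) / f;  // 1.0 (100%) when f equals one bin width
}
```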
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Is it possible to do that without resorting to programming, maybe even using analog components?
If you limit the frequency of the signals and then clip them, you could probably do a phase detector with a 4046 ... as long as the delay is smaller than the inverse of the highest frequency, it can determine the delay. That would only work in a tiny cone though. Doing more than that discretely would quickly get very hard AFAICS.
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Don't know about esp32, but as any digital bus I2C trasfer time is defined by clock rate * number of bits. I2C standard AFAIK supports 100k, 400k, and 3.4M clock.

I am afraid that you confused I2S with I2C, which is a totally different animal:

And earlier I was thinking of using i2s mems to bypass ADC conversion time ... it seemed like a good idea but how long will transfering the i2s stream take?

Any ADC will delay the signal by some amount, so it does not matter whether it is an I2S ADC or not. MEMS microphones with I2S will be fine indeed.
« Last Edit: May 01, 2018, 08:53:04 pm by ogden »
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Is it possible to do that without resorting to programming, maybe even using analog components?
If you limit the frequency of the signals and then clip them, you could probably do a phase detector a 4046 ... as long as the delay is smaller than the inverse of the highest frequency it can determine the delay. That would only work in a tiny cone though. Doing more than that discretely would quickly get very hard AFAICS.

Such a clipping approach would only work in a textbook example with a single signal source emitting a clean sine tone, not in a real-world application where various complex sounds come from multiple sources. This application indeed needs DSP processing, and thus programming.
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Don't know about esp32, but as any digital bus I2C trasfer time is defined by clock rate * number of bits. I2C standard AFAIK supports 100k, 400k, and 3.4M clock.

I am afraid that you confused I2S with I2C which is totally different animal:
Ooops, mea culpa, misreading. I didn't know the ESP32 could support I2S. I interfaced an Arduino DUE to a WM8731 over I2S/DMA, but that's for 1 CODEC, 2 channels in/out only. Bringing I2S into the picture would make the project much harder to build/program.
Here is a piece of code, to give an overview of the complexity:
Code: [Select]
void DMAC_Handler(void)
{
  uint32_t dma_status;

//  digitalWrite( test_pin, HIGH);
  dma_status = DMAC->DMAC_EBCISR;
  // BUG : ISR = if (dma_status & (DMAC_EBCIER_CBTC0 << SSC_DMAC_RX_CH))
  if (dma_status & (DMAC_EBCISR_BTC0 << SSC_DMAC_RX_CH)) {
    flag_adcv = 1;
    }

  if (dma_status & (DMAC_EBCISR_BTC0 << SSC_DMAC_TX_CH)) {
    flag_dacv = 1;
    }
  if(flag_adcv && (!flag_dacv))  adcdac_first = 1; 
  if((!flag_adcv) && flag_dacv)  adcdac_first = 2; 
//  digitalWrite( test_pin,  LOW);
}

void ssc_dma_cfg()
{
  desc[0].ul_source_addr      = (uint32_t)(&SSC->SSC_RHR);
  desc[0].ul_destination_addr = (uint32_t) inp[0];
  desc[0].ul_ctrlA = DMAC_CTRLA_BTSIZE(INP_BUFF) |/* Set Buffer Transfer Size */
                     DMAC_CTRLA_SRC_WIDTH_WORD |  /* Source transfer size is set to 32-bit width */
                     DMAC_CTRLA_DST_WIDTH_WORD;   /* Destination transfer size is set to 32-bit width */

  desc[0].ul_ctrlB = DMAC_CTRLB_SRC_DSCR_FETCH_DISABLE |
                     DMAC_CTRLB_DST_DSCR_FETCH_FROM_MEM |           
                     DMAC_CTRLB_FC_PER2MEM_DMA_FC |                 
                     DMAC_CTRLB_SRC_INCR_FIXED |             
                     DMAC_CTRLB_DST_INCR_INCREMENTING;       
  desc[0].ul_descriptor_addr = (uint32_t) &desc[1];

  desc[1].ul_source_addr      = (uint32_t)(&SSC->SSC_RHR);
  desc[1].ul_destination_addr = (uint32_t) inp[1];
  desc[1].ul_ctrlA = DMAC_CTRLA_BTSIZE(INP_BUFF) |   
                     DMAC_CTRLA_SRC_WIDTH_WORD |                 
                     DMAC_CTRLA_DST_WIDTH_WORD;                   

  desc[1].ul_ctrlB = DMAC_CTRLB_SRC_DSCR_FETCH_DISABLE |
                     DMAC_CTRLB_DST_DSCR_FETCH_FROM_MEM |           
                     DMAC_CTRLB_FC_PER2MEM_DMA_FC |                 
                     DMAC_CTRLB_SRC_INCR_FIXED |             
                     DMAC_CTRLB_DST_INCR_INCREMENTING;           
  desc[1].ul_descriptor_addr = (uint32_t) &desc[0];       
 
  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_SADDR = desc[0].ul_source_addr;
  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_DADDR = desc[0].ul_destination_addr;
  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_CTRLA = desc[0].ul_ctrlA;
  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_CTRLB = desc[0].ul_ctrlB;
  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_DSCR  = desc[0].ul_descriptor_addr;

  DMAC->DMAC_CHER = DMAC_CHER_ENA0 << SSC_DMAC_RX_CH;
  ssc_enable_rx(SSC);

// transmit
  desc[2].ul_source_addr      = (uint32_t) out[0];
  desc[2].ul_destination_addr = (uint32_t) (&SSC->SSC_THR);
  desc[2].ul_ctrlA = DMAC_CTRLA_BTSIZE(INP_BUFF) |
                     DMAC_CTRLA_SRC_WIDTH_WORD | 
                     DMAC_CTRLA_DST_WIDTH_WORD;   

  desc[2].ul_ctrlB = DMAC_CTRLB_SRC_DSCR_FETCH_FROM_MEM |
                     DMAC_CTRLB_DST_DSCR_FETCH_DISABLE |       
                     DMAC_CTRLB_FC_MEM2PER_DMA_FC |                 
                     DMAC_CTRLB_SRC_INCR_INCREMENTING |             
                     DMAC_CTRLB_DST_INCR_FIXED;                       
  desc[2].ul_descriptor_addr = (uint32_t) &desc[3];       

  desc[3].ul_source_addr      = (uint32_t) out[1];
  desc[3].ul_destination_addr = (uint32_t) (&SSC->SSC_THR);
  desc[3].ul_ctrlA = DMAC_CTRLA_BTSIZE(INP_BUFF) |
                     DMAC_CTRLA_SRC_WIDTH_WORD |               
                     DMAC_CTRLA_DST_WIDTH_WORD;               

  desc[3].ul_ctrlB = DMAC_CTRLB_SRC_DSCR_FETCH_FROM_MEM |
                     DMAC_CTRLB_DST_DSCR_FETCH_DISABLE |       
                     DMAC_CTRLB_FC_MEM2PER_DMA_FC |                 
                     DMAC_CTRLB_SRC_INCR_INCREMENTING |             
                     DMAC_CTRLB_DST_INCR_FIXED;                       
  desc[3].ul_descriptor_addr = (uint32_t) &desc[2];       

  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_SADDR = desc[2].ul_source_addr;
  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_DADDR = desc[2].ul_destination_addr;
  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_CTRLA = desc[2].ul_ctrlA;
  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_CTRLB = desc[2].ul_ctrlB;
  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_DSCR  = desc[2].ul_descriptor_addr;

  DMAC->DMAC_CHER = DMAC_CHER_ENA0 << SSC_DMAC_TX_CH;
  ssc_enable_tx(SSC);
}

void init_dma() {
  uint32_t ul_cfg;

  pmc_enable_periph_clk(ID_DMAC);
  DMAC->DMAC_EN &= (~DMAC_EN_ENABLE);
  DMAC->DMAC_GCFG = (DMAC->DMAC_GCFG & (~DMAC_GCFG_ARB_CFG)) | DMAC_GCFG_ARB_CFG_ROUND_ROBIN;
  DMAC->DMAC_EN = DMAC_EN_ENABLE;

  ul_cfg = 0;
  ul_cfg = DMAC_CFG_SRC_PER(SSC_DMAC_RX_ID) |
           DMAC_CFG_SRC_H2SEL |
           DMAC_CFG_SOD_DISABLE | //SOD: Stop On Done
           DMAC_CFG_FIFOCFG_ALAP_CFG;

  DMAC->DMAC_CH_NUM[SSC_DMAC_RX_CH].DMAC_CFG = ul_cfg;
  DMAC->DMAC_CHDR = DMAC_CHDR_DIS0 << SSC_DMAC_RX_CH;

//transmit
  ul_cfg = 0;
  ul_cfg = DMAC_CFG_DST_PER(SSC_DMAC_TX_ID) |
           DMAC_CFG_DST_H2SEL |
           DMAC_CFG_SOD_DISABLE | //SOD: Stop On Done
           DMAC_CFG_FIFOCFG_ALAP_CFG;

  DMAC->DMAC_CH_NUM[SSC_DMAC_TX_CH].DMAC_CFG = ul_cfg;
  DMAC->DMAC_CHDR = DMAC_CHDR_DIS0 << SSC_DMAC_TX_CH;
//

  NVIC_EnableIRQ(DMAC_IRQn);

  DMAC->DMAC_EBCIER = DMAC_EBCIER_BTC0 << SSC_DMAC_RX_CH;

  DMAC->DMAC_EBCIER = DMAC_EBCIER_BTC0 << SSC_DMAC_TX_CH;

  DMAC->DMAC_EBCISR;
}

void init_ssc() {
  clock_opt_t        rx_clk_option;
  data_frame_opt_t   rx_data_frame_option;
  clock_opt_t        tx_clk_option;
  data_frame_opt_t   tx_data_frame_option;

  memset((uint8_t *)&rx_clk_option,        0, sizeof(clock_opt_t));
  memset((uint8_t *)&rx_data_frame_option, 0, sizeof(data_frame_opt_t));
  memset((uint8_t *)&tx_clk_option,        0, sizeof(clock_opt_t));
  memset((uint8_t *)&tx_data_frame_option, 0, sizeof(data_frame_opt_t));

  pmc_enable_periph_clk(ID_SSC);
  ssc_reset(SSC);

  rx_clk_option.ul_cks               = SSC_RCMR_CKS_RK;
  rx_clk_option.ul_cko               = SSC_RCMR_CKO_NONE;
  rx_clk_option.ul_cki               = SSC_RCMR_CKI;
  //1 = The data inputs (Data and Frame Sync signals)
  //    are sampled on Receive Clock rising edge. 
  rx_clk_option.ul_ckg               = SSC_RCMR_CKG_NONE; // bylo 0;
  rx_clk_option.ul_start_sel         = SSC_RCMR_START_RF_RISING;
  rx_clk_option.ul_period            = 0;
  rx_clk_option.ul_sttdly            = 1;

  rx_data_frame_option.ul_datlen     = BIT_LEN_PER_CHANNEL - 1;
  rx_data_frame_option.ul_msbf       = SSC_RFMR_MSBF;
  rx_data_frame_option.ul_datnb      = 1;//stereo
  rx_data_frame_option.ul_fslen      = 0;
  rx_data_frame_option.ul_fslen_ext  = 0;
  rx_data_frame_option.ul_fsos       = SSC_RFMR_FSOS_NONE;
  rx_data_frame_option.ul_fsedge     = SSC_RFMR_FSEDGE_POSITIVE;

  ssc_set_receiver(SSC, &rx_clk_option, &rx_data_frame_option);
  ssc_disable_rx(SSC);
//  ssc_disable_interrupt(SSC, 0xFFFFFFFF);

  tx_clk_option.ul_cks               = SSC_TCMR_CKS_RK;
  tx_clk_option.ul_cko               = SSC_TCMR_CKO_NONE;
  tx_clk_option.ul_cki               = 0; //example Atmel. bylo SSC_TCMR_CKI;
  //1 = The data outputs (Data and Frame Sync signals)
  //    are shifted out on Transmit Clock rising edge.
  tx_clk_option.ul_ckg               = SSC_TCMR_CKG_NONE; // bylo 0.
  tx_clk_option.ul_start_sel         = SSC_TCMR_START_RF_RISING;
  tx_clk_option.ul_period            = 0;
  tx_clk_option.ul_sttdly            = 1;

  tx_data_frame_option.ul_datlen     = BIT_LEN_PER_CHANNEL - 1;
  tx_data_frame_option.ul_msbf       = SSC_TFMR_MSBF;
  tx_data_frame_option.ul_datnb      = 1;
  tx_data_frame_option.ul_fslen      = 0; // :fsden=0
  tx_data_frame_option.ul_fslen_ext  = 0;
  tx_data_frame_option.ul_fsos       = SSC_TFMR_FSOS_NONE; //input-slave
  tx_data_frame_option.ul_fsedge     = SSC_TFMR_FSEDGE_POSITIVE;

  ssc_set_transmitter(SSC, &tx_clk_option, &tx_data_frame_option);
  ssc_disable_tx(SSC);
  ssc_disable_interrupt(SSC, 0xFFFFFFFF);
}

void pio_B_SSC(void)
{
// DUE: PA15(B)-D24, PA16(B)-A0, PA14(B)-D23 = DACLRC, DACDAT, BCLK 
  PIOA->PIO_PDR   = PIO_PA14B_TK;
  PIOA->PIO_IDR   = PIO_PA14B_TK;
  PIOA->PIO_ABSR |= PIO_PA14B_TK;
 
  PIOA->PIO_PDR   = PIO_PA15B_TF;
  PIOA->PIO_IDR   = PIO_PA15B_TF;
  PIOA->PIO_ABSR |= PIO_PA15B_TF;

  PIOA->PIO_PDR   = PIO_PA16B_TD;
  PIOA->PIO_IDR   = PIO_PA16B_TD;
  PIOA->PIO_ABSR |= PIO_PA16B_TD;
}
 
The following users thanked this post: daslolo

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Such clipping approach would work only in textbook example for single signal source of clean sine tone, not real world application where various complex sounds are coming from multiple sources. This application indeed needs DSP processing, thus programming.
He talked about using 9 LEDs ... something which only works under limited circumstances doesn't seem like a huge problem to me. This is clearly more science fair than product.
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Such clipping approach would work only in textbook example for single signal source of clean sine tone, not real world application where various complex sounds are coming from multiple sources. This application indeed needs DSP processing, thus programming.
He talked about using 9 LEDs ...  something which only works under limited circumstances doesn't seem like a huge problem to me. This is clearly more science fair than product.

It does not matter whether you use 3x3 LEDs or an HD display: the clipping approach will work only in theory.

The Siemens LMS Sound Camera is indeed a product:


 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Yes, that one. They all use a logarithmic spiral. Does anyone know why?
@ogden What's the problem with the clipping approach?
@MasterT good thing: that complexity reinforces my decision to go all analog.
By the way, speaking of ADCs, are there ADCs out there that do a mass conversion of all channels at once? Then I wouldn't have to spend time in the MCU counter-offsetting the signals.

Anyway I got the FFT running, and having two cores is sweet, so I might as well use programming to make this thing.
https://github.com/laurentopia/M5-Signal-Multimeter
« Last Edit: May 01, 2018, 11:12:50 pm by daslolo »
nine nine nein
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Yes that one. They all use logarithmic spiral. Anyone knows why?

IMHO the reason is marketing. A spiral just looks better than straight beams.

Quote
@ogden What's the problem with the clipping approach?

With clipping you lose the signal amplitude information. Also, even a quiet hiss riding on top of the sound will add so much noise and "jitter" to the phase measurements that it will not work without averaging. And if you need averaging, which more or less is DSP processing, then why not do proper DSP from the very beginning? Why reduce the resolution to 1 bit only to struggle to get it back later?

Quote
@MasterT good thing the complexity reinforce my decision of going all analog.

I would love to see a working, fully analog system :)

Quote
By the way, speaking of ADC, are there ADC out there that do mass conversion all at once? Then I wouldn't have to spend time in the mCU and counter-offset the signals.

There are flash ADCs, which are very fast. But again, the processing will add delay, and sound propagation is not instant after all. So it does not actually matter whether you find the sound direction with 1 ms, 2 ms, or 10 ms of delay.
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Yes that one. They all use logarithmic spiral. Anyone knows why?

@MasterT good thing the complexity reinforce my decision of going all analog.
By the way, speaking of ADC, are there ADC out there that do mass conversion all at once? Then I wouldn't have to spend time in the mCU and counter-offset the signals.

Anyway I got the FFT running, and having two cores is sweet, so I might as well use programming to make this thing.
The logarithmic spiral is there to impress the potential customer: to suggest that the sound camera is indeed a very complex, and consequently costly, product. Btw, I should say that I built a sound localizer/camera twice in the recent past, once with an Arduino Leonardo and once with an Arduino DUE.
You don't have to spend 120 µs or so waiting for an ADC conversion to complete. There are two ways to save time. On the 8-bit ATmega controllers you activate the ADC conversion-complete interrupt and enter the interrupt subroutine when the sample is ready. That's about 6-10 µs, compared to 120 µs spent busy-waiting. I'm talking ATmega328/ATmega32U4 etc.
The Arduino DUE has DMA, so you enter the interrupt subroutine only once, when the complete data array is ready, and sampling at 1 Msps is quite easy to do without CPU involvement.

Nice that you have the FFT running; now you should check the timing, to see if you can run the FFT on the data stream (4 mics at 10 k each = 40 ksps or so). Some optimization may be required. The Arduino DUE, for example, is capable of processing data through an FFT at 250-300 ksps, so optimization isn't required even for hi-fi sound at 48 kHz * 4 = 192 ksps. But I did optimize the processing on the Arduino Leonardo, interleaving processing for the left/right then top/bottom mics, as the Leonardo is only good for about 20 ksps.
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
Cross correlation is the same thing as DFT. Has very little or zero practical value, since FFT about thousands time faster.
I think we can indeed get the cross-correlation of two signals from their DFTs and thus save time; you are right. What I mean is that getting the cross-correlation, by whatever means, should be the main method here, because it directly tells us the time difference between the two signals. Or not? If not, then I don't understand how the FFT approach is supposed to work.  :-//

By clipping you are essentially applying non-linear distortion to your signal. Do an experiment: observe the spectrum of one sinewave, then the spectrum of a clipped sinewave. Now do the same with a sum of two sinewaves. You should notice that the peaks at frequencies w1 and w2 have effectively been multiplied together: intermodulation peaks appear at combinations like w1+w2, 2*w1+w2, and so on.
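A numeric sketch of that experiment (naive single-bin DFT, hypothetical helper names). One detail worth noting: a symmetric clipper is an odd nonlinearity, so the strong products land at odd-order combinations such as 2*w2-w1 and 2*w1+w2, and a clean two-tone has nothing at those frequencies.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <vector>

// Magnitude of one DFT bin (naive; fine for a demo).
double bin_mag(const std::vector<double>& x, int k) {
    const double PI = 3.14159265358979323846;
    std::complex<double> acc(0.0, 0.0);
    for (int n = 0; n < (int)x.size(); ++n)
        acc += x[n] * std::polar(1.0, -2.0 * PI * k * n / (double)x.size());
    return std::abs(acc);
}

// Two equal tones at bins 10 and 13 of a 256-point record, optionally
// hard-clipped at +/-0.7. Clipping creates energy at intermod bins
// (e.g. 2*13-10 = 16) where the clean signal has exactly none.
std::vector<double> two_tone(bool clipped) {
    const double PI = 3.14159265358979323846;
    std::vector<double> x(256);
    for (int n = 0; n < 256; ++n) {
        double v = std::sin(2.0 * PI * 10 * n / 256.0)
                 + std::sin(2.0 * PI * 13 * n / 256.0);
        if (clipped) v = std::max(-0.7, std::min(0.7, v));  // hard clip
        x[n] = v;
    }
    return x;
}
```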
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Cross correlation is the same thing as DFT. Has very little or zero practical value, since FFT about thousands time faster.
I think we can indeed get the cross correlation of two signals from their DFTs and thus save time, you are right. What I mean is that getting the cross correlation by any means should be the main method here, because it can directly tell us the time difference between two signals. Or not? If not then I don't understand how the FFT approach should work.  :-//
Right, cross-correlation outputs the time difference. FFT outputs the phase difference, but since we know the frequency and the speed of sound in air, it's a piece of cake to translate phase to time and back, whichever is more practical for the final result. In direction finding, the angle to the sound source is what has to be determined, so the time difference or phase must be translated to an angle based on the distance between the two mics.
A second reason against cross-correlation is that it produces a single summary over the whole bandwidth. If there are two or more sound sources (as always in the real world), CC gives meaningless data. At the same time, FFT can easily distinguish hundreds of sound sources, as long as they don't overlap in bandwidth, or don't overlap completely. Cutting a 78 Hz slice out of the 0-10 kHz band and sorting out the sound patterns of each noise source, you could identify the narrow band specific to each of them and find the right direction.
The same applies to standing waves, reverberation, and echoes, especially indoors: only FFT is capable of sorting out and throwing away the part of the data pool that is most distorted/corrupted and still solving the trigonometry.
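The translations mentioned above, sketched in code (hypothetical helper names; assumes a plane wave, speed of sound c = 343 m/s, and mic spacing d in meters):

```cpp
#include <cmath>

// Phase difference phi (radians) at frequency f (Hz) -> time difference (s).
double phase_to_delay(double phi, double f) {
    const double PI = 3.14159265358979323846;
    return phi / (2.0 * PI * f);
}

// Time difference (s) across mic spacing d (m) -> arrival angle in radians
// off broadside. Only valid while |c * dt| <= d.
double delay_to_angle(double dt, double d, double c = 343.0) {
    return std::asin(c * dt / d);
}
```

For example, a quarter-cycle (pi/2) phase lag at 1 kHz is a 0.25 ms delay, and across a 10 cm spacing a delay of half of d/c puts the source 30 degrees off broadside.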
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Cross correlation is the same thing as DFT. Has very little or zero practical value, since FFT about thousands time faster.
FFT is a fast algorithm for the DFT; discrete cross-correlation can be accelerated with it. Preferably a real-FFT, so you don't have to combine two real signals into one complex transform for efficiency.
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Preferably a real-FFT so you don't have to combine signals for efficiency.
What does that mean, real-FFT? Does a non-real FFT exist as well? And is "combine signals" a new mathematics term? Sorry for my limited vocabulary; I know + - * /, but I've never heard "combine".
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Yes that one. They all use logarithmic spiral. Anyone knows why?

@MasterT good thing the complexity reinforce my decision of going all analog.
By the way, speaking of ADC, are there ADC out there that do mass conversion all at once? Then I wouldn't have to spend time in the mCU and counter-offset the signals.

Anyway I got the FFT running, and having two cores is sweet, so I might as well use programming to make this thing.
The logarithmic spiral is there to impress potential customers: a sound camera is indeed a very complex, and consequently costly, feature. Btw, I should say that I built a sound localizer/camera twice in the recent past, once with an Arduino Leonardo and once with an Arduino DUE.
You don't have to spend 120 usec or so waiting for an ADC conversion to complete. There are two ways to save time. On the 8-bit ATmega controllers you activate the ADC conversion-complete interrupt and enter the interrupt subroutine when the sample is ready. That takes about 6-10 usec, compared to 120 usec spent in a waiting state. I'm talking ATmega328/ATmega32U4 etc.
The Arduino DUE has DMA, so you enter the interrupt subroutine only once, when the complete data array is ready, and sampling at 1 Msps is quite easy to do without CPU involvement.

Nice that you have the FFT running; now you should check the timing, i.e. whether you can run the FFT on the stream of data (4 mics x 10 ksps = 40 ksps or so). Some optimization may be required. The Arduino DUE, for example, can process data through the FFT at 250-300 ksps, so optimization isn't required even for hi-fi sound (48 kHz x 4 =< 200 ksps). But I did optimize the processing on the Arduino Leonardo, interleaving processing for the left/right then top/bottom mics, as the Leonardo is only good for about 20 ksps.

The spiral does look science-y, and the housing looks desirably expensive.
I'd love to see your sound camera in action, do you have a link to your project? I didn't know the Arduino boards could push 1 Msps! Is this per channel? Through the Arduino IDE, or are you poking at registers directly?

I don't know how to calculate sps, but maybe you're talking about the sampling frequency. In that case I'm sampling at 10 kHz.
As for the FFT timing, on the ESP32 one 512-bin FFT takes 14 ms, the capture takes 10 ms at 10 kHz, and 40 kHz is the sampling limit using a delayed loop of analogRead.
Someone bypassed the register safety lock and managed 25 MHz output, so fast input sampling must be possible, but I must say I don't yet understand how it's done or what the impact is on using a second core for capturing (which is what I'm doing).
I haven't used DMA, but I read that the ESP32 has DMA as well; I don't know if this is the same kind of access as your DUE though.
nine nine nein
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Making a sound camera
« Reply #31 on: May 02, 2018, 04:52:36 am »
The spiral looks like the one used as the core of the Square Kilometer Array.



https://www.skatelescope.org/layout/

Quote
The spiral layout design has been chosen after detailed study by scientists into how best optimise the configuration to get the best possible results.

This spiral configuration gives many different lengths (baselines) and angles between antennas resulting in very high-resolution imaging capability.

The perfect layout would be a random arrangement that maximises the number of different baselines and angles between antennas. However, the practicalities of construction as well as linking the antennas together with cables mean that the spiral configuration is the best trade off between image resolution and cost.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: ogden

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13748
  • Country: gb
    • Mike's Electric Stuff
Re: Making a sound camera
« Reply #32 on: May 02, 2018, 07:05:31 am »
Someone posted here a while ago that they had done something similar. They used digital-output MEMS microphone modules and an FPGA to make a pretty cheap system.
I don't recall how far they got in producing visual output.
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13748
  • Country: gb
    • Mike's Electric Stuff
Yes, that one. They all use logarithmic spirals. Does anyone know why?
My guess would be something to do with standing waves, and/or having a large number of different mic-to-mic distances while maintaining symmetry of sensitivity.

Digital-output MEMS mics with I2S or pulse-modulation schemes are very cheap, so it may make sense to trade off the number of mics against processing complexity.

Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
What does that mean, real-FFT? Does a non-real part exist as well? And is "combine signals" a new mathematics term? Sorry for my limited vocabulary; I know + - * /, I never heard "combine".
Most FFT implementations are complex<->complex; what we are generally interested in in DSP is a real->complex FFT and a complex->real iFFT. You can use one complex FFT to do two real FFTs, but it's a headache.
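The redundancy behind the real-FFT idea is easy to check numerically. A quick sketch (Python/NumPy for brevity, not anyone's production code from this thread): for real input, the upper half of a complex FFT is the conjugate mirror of the lower half, and a real-FFT routine returns only the non-redundant bins.

```python
import numpy as np

# Real-valued test signal: a tone plus a little noise.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 5 * np.arange(64) / 64) + 0.1 * rng.standard_normal(64)

full = np.fft.fft(x)   # complex FFT: 64 bins
half = np.fft.rfft(x)  # real FFT: 33 bins (DC .. Nyquist)

# For real input, bin N-k of the complex FFT is the conjugate of bin k,
# so the upper half carries no new information.
assert np.allclose(full[1:32], np.conj(full[-1:-32:-1]))

# rfft returns exactly the non-redundant bins 0 .. N/2 of the full FFT.
assert np.allclose(half, full[:33])
```

The real-FFT still covers DC up to Nyquist, so no frequency range is lost; only the mirrored half is dropped.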
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
I'd love to see your sound camera in action, do you have a link to your project? I didn't know the Arduino boards could push 1 Msps! Is this per channel? Through the Arduino IDE, or are you poking at registers directly?

I don't know how to calculate sps, but maybe you're talking about the sampling frequency. In that case I'm sampling at 10 kHz.
As for the FFT timing, on the ESP32 one 512-bin FFT takes 14 ms, the capture takes 10 ms at 10 kHz, and 40 kHz is the sampling limit using a delayed loop of analogRead.
Someone bypassed the register safety lock and managed 25 MHz output, so fast input sampling must be possible, but I must say I don't yet understand how it's done or what the impact is on using a second core for capturing (which is what I'm doing).
I haven't used DMA, but I read that the ESP32 has DMA as well; I don't know if this is the same kind of access as your DUE though.
I don't have a video; it's lost. Visualization was done on an Android tablet; I used BT to transfer the processed/filtered and sorted data. 1 megasample per second via direct register programming; the Arduino IDE is sloppy. One ADC, 1 Msps shared across all channels.

Stream sample rate only makes sense with real-time data processing, when the ADC conversion runs continuously in the background using DMA or interrupts. The FFT must be computed faster than the sampling; 14 ms for a 512-point FFT means you could theoretically sustain 36571 Hz sampling, not counting overhead for data management.
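That 36571 Hz figure is just the transform size divided by the transform time; a trivial check (Python, using the ESP32 numbers quoted above):

```python
fft_points = 512    # samples per FFT (from the post above)
fft_time = 0.014    # 14 ms per 512-point FFT on the ESP32
max_rate = fft_points / fft_time  # fastest stream the FFT can keep up with
print(round(max_rate))  # 36571 Hz
```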

What does that mean, real-FFT? Does a non-real part exist as well? And is "combine signals" a new mathematics term? Sorry for my limited vocabulary; I know + - * /, I never heard "combine".
Most FFT implementations are complex<->complex; what we are generally interested in in DSP is a real->complex FFT and a complex->real iFFT. You can use one complex FFT to do two real FFTs, but it's a headache.
It's an old myth that we can save time by not doing math on the empty imaginary part of the input data. I did some research, and it turns out that the CPU-clock savings come with only half the frequency resolution, so all this theory defining a real->FFT is complete BS. There are many proven optimization techniques that don't sacrifice frequency resolution, like using a higher radix or split-radix, but it all depends on the specific uCPU and its instruction set, MAC availability, MULT vs ADD performance, etc.
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
It's an old myth
No, multiplying by 0 is always a waste of time.
Quote
and it's turns out that cpu clock savings comes with only half frequency resolution
If you fill the imaginary input of a complex FFT with 0s, the output is symmetric ... that the index for the frequency bins counts up higher is irrelevant, half the information is completely redundant.

A real-FFT goes up to Nyquist, just like the real DFT, that's obviously sufficient.
« Last Edit: May 02, 2018, 01:29:42 pm by Marco »
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
It's an old myth
No, multiplying by 0 is always a waste of time.
Quote
and it's turns out that cpu clock savings comes with only half frequency resolution
If you fill the imaginary input of a complex FFT with 0s, the output is symmetric ... that the index for the frequency bins counts up higher is irrelevant, half the information is completely redundant.
.
Show your code, then talk.
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Re: Making a sound camera
« Reply #39 on: May 02, 2018, 02:16:56 pm »
NOU.
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
If you fill the imaginary input of a complex FFT with 0s, the output is symmetric ... that the index for the frequency bins counts up higher is irrelevant, half the information is completely redundant.
.
Show your code, then talk.

Code can be 3rd party as well :)

https://www.keil.com/pack/doc/CMSIS/DSP/html/group__RealFFT.html

Most likely you are interested in just the DSP package:

https://github.com/ARM-software/CMSIS_5/releases/tag/5.3.0
 

Offline Sparker

  • Regular Contributor
  • *
  • Posts: 56
  • Country: ru
Right, cross-correlation outputs the time difference. FFT outputs the phase difference, but since we know the frequency and the speed of sound in air, it's a piece of cake to translate phase to time and back, whichever is more practical for the final result. In direction finding, an angle to the sound source is what has to be determined, so the time difference or phase must be translated to an angle based on the distance between the two mics.
The second reason against cross-correlation is that it returns a summary over the whole bandwidth. If there are two or more sound sources (always, in the real world), CC gives meaningless data. At the same time, FFT can easily distinguish hundreds of sound sources, as long as they don't overlap in bandwidth, or don't overlap completely. Cutting a 78 Hz slice out of the 0-10 kHz band and sorting out the sound patterns of each noise source, you can identify a narrow band specific to each of them and find the right direction.
The same applies to standing waves, reverberation and echoes, especially indoors: only FFT can sort out and throw away the part of the data pool that is most distorted/corrupted and still resolve the trigonometry.
I've just tested cross-correlation in Matlab, and it can actually work nicely with multiple noise-like signals that have narrow autocorrelation and low cross-correlation. But the method works badly with human speech, because it's not as random as I initially assumed.  :(
DFT is the choice here indeed.
What I don't understand is: if two sound sources occupy exactly the same frequency, like the car control panel in the video in this thread, how can the system differentiate them?
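The noise-like case is easy to reproduce. A sketch (Python/NumPy, with a made-up 37-sample delay): white noise has a sharp autocorrelation, so cross-correlating the two mic signals gives one unambiguous peak at the true lag.

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_lag = 1024, 37

# A white-noise "source" picked up by two mics, the second delayed by
# true_lag samples (circularly, for simplicity).
s = rng.standard_normal(n)
mic1 = s
mic2 = np.roll(s, true_lag)

# Full cross-correlation; the peak index gives the lag between the mics.
xc = np.correlate(mic2, mic1, mode="full")
lag = np.argmax(xc) - (n - 1)
print(lag)  # 37: the noise autocorrelation is sharp, so the peak is unambiguous
```

With a narrowband or strongly periodic source (speech pitch, a hum), the correlation shows a comb of near-equal peaks instead of one spike, which is exactly the ambiguity described above.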
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Re: Making a sound camera
« Reply #42 on: May 02, 2018, 04:58:36 pm »
It's all fine and well to say "use the DFT", but that doesn't really mean anything by itself. A bunch of phase differences between microphones at given frequencies isn't easy to convert to a delay; the phases are modulo 2pi, after all.

If you want to pick out part of the spectrum, just multiply the FFT-transformed microphone signals by a Fourier-domain representation of a minimum-phase bandpass filter before multiplying them with each other and doing the inverse FFT. It's still cross-correlation, just of band-limited versions of the microphone signals.

PS. Don't forget to zero-pad the microphone signals before doing the FFT.
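This recipe can be sketched in Python/NumPy (the brick-wall band weights and the bin range below are arbitrary stand-ins for a properly designed minimum-phase band-pass filter): zero-pad, transform, weight the spectrum, multiply one transform by the conjugate of the other, inverse-transform.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 512, 21
s = rng.standard_normal(n)
# Zero-pad both signals to 2n so the FFT's circular correlation is linear.
mic1 = np.concatenate([s, np.zeros(n)])
mic2 = np.concatenate([np.zeros(d), s, np.zeros(n - d)])  # mic1 delayed by d

F1 = np.fft.rfft(mic1)
F2 = np.fft.rfft(mic2)

# Crude band-pass weighting in the Fourier domain: keep only bins 10..99
# of the 2n-point spectrum (a placeholder for the minimum-phase filter).
band = np.zeros_like(F1)
band[10:100] = 1.0

# Cross-spectrum of the band-limited signals, then inverse FFT gives the
# cross-correlation of the band-limited microphone signals.
xc = np.fft.irfft(F2 * np.conj(F1) * band)
lag = np.argmax(xc)
print(lag)  # 21: the delay survives the band limiting
```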
« Last Edit: May 02, 2018, 05:05:57 pm by Marco »
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
It's crappy Radix-2, that is good for undeveloped third world  tribes numba-umba.

Here we go 8)

Quote
And for whoever picks up any BS posted on wiki pages and then posts it in this thread, which was started by an OP who acknowledged his lack of software skills in the first message: I don't see a point in continuing this dispute here. Should we start another FFT-related thread?

Yes, please. Share your wisdom. I am especially interested in whether the discussion can result in faster and/or smaller Cortex-M-optimized complex FFT code than that in the CMSIS DSP lib.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #45 on: May 04, 2018, 12:05:38 am »
The spiral looks like the one used as the core of the Square Kilometer Array.



https://www.skatelescope.org/layout/

Quote
The spiral layout design has been chosen after detailed study by scientists into how best optimise the configuration to get the best possible results.

This spiral configuration gives many different lengths (baselines) and angles between antennas resulting in very high-resolution imaging capability.

The perfect layout would be a random arrangement that maximises the number of different baselines and angles between antennas. However, the practicalities of construction as well as linking the antennas together with cables mean that the spiral configuration is the best trade off between image resolution and cost.
OK good to see there is a very good reason. Reminds me of what I read about Viktor Schauberger years ago. I wonder what else we can lift from his eminent work on water.

Yes, that one. They all use logarithmic spirals. Does anyone know why?
My guess would be something to do with standing waves, and/or having a large number of different mic-to-mic distances while maintaining symmetry of sensitivity.

Digital-output MEMS mics with I2S or pulse-modulation schemes are very cheap, so it may make sense to trade off the number of mics against processing complexity.

What do you mean, it has to do with standing waves?
Also, can you explain the trade-off between the number of mics and processing complexity? It seems to me that more mics = more FFTs to churn.

You guys lost me at radix until I saw it's synonymous with "base"; I'll ask why that matters in your other thread, it's interesting.

Can anyone give me an explanation of why that jambalaya with FFT will give me the inter-mic delays? I understand that a frequency-domain graph gives me a sort of instant signature of the sound, but I don't understand how that detects that a sound arrived a few ms earlier on this mic vs that mic...
nine nine nein
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Re: Making a sound camera
« Reply #46 on: May 04, 2018, 12:33:19 am »
Can anyone give me an explanation of why that jambalaya with FFT will give me inter-mic delays?

The output of a complex FFT is an array of complex numbers, which means you get not only the magnitude but also the phase for each frequency bin. Using simple trigonometry you can calculate the phase-angle difference between corresponding bins of two FFTs. Knowing the bin's frequency and the phase-angle difference, you can calculate the signal delay time. This of course is not a straightforward operation, because at high frequencies the phase-angle difference will most likely exceed 360 degrees :)

[edit] A real FFT is just a spectrum analyzer. A complex FFT is more than that.

Quote
I understand that a frequency-domain graph will give me a sort of instant signature of the sound, but I don't understand how that'll detect that one sound arrived a few ms earlier on this mic vs that mic...

During just 1 ms, sound travels a whopping 0.343 m. Your mic array would have to be gigantic to get a few ms of delay between mics :)
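Both points, phase-to-delay and the wrap-around problem, fit in a few lines (Python; the 100 µs delay and the tone frequencies are made-up numbers):

```python
import numpy as np

c = 343.0        # speed of sound in air, m/s
f = 1000.0       # tone frequency, Hz
delay = 100e-6   # 100 us inter-mic delay
path = c * delay # = 0.0343 m of extra path to the far mic

# Phase difference the 1 kHz bin would show for this delay: about 36 degrees,
# comfortably unambiguous, and it inverts back to the exact delay.
dphi = 2 * np.pi * f * delay
t_est = dphi / (2 * np.pi * f)
assert np.isclose(t_est, delay)

# At 12 kHz the same delay is 1.2 cycles; atan2 only ever reports the
# wrapped remainder, which aliases to a much smaller apparent delay.
f_hi = 12000.0
wrapped = np.angle(np.exp(1j * 2 * np.pi * f_hi * delay))
print(wrapped / (2 * np.pi * f_hi))  # ~16.7 us instead of 100 us
```

This is why the phase-unwrapping (or a time-domain cross-correlation) step matters once the mic spacing exceeds a wavelength.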
« Last Edit: May 04, 2018, 12:46:05 am by ogden »
 

Online MasterT

  • Frequent Contributor
  • **
  • Posts: 785
  • Country: ca
Re: Making a sound camera
« Reply #47 on: May 04, 2018, 01:05:24 am »

1. Also, can you explain the trade-off between the number of mics and processing complexity? It seems to me that more mics = more FFTs to churn.

2. You guys lost me at radix until I saw it's synonymous with "base"; I'll ask why that matters in your other thread, it's interesting.

3. Can anyone give me an explanation of why that jambalaya with FFT will give me the inter-mic delays? I understand that a frequency-domain graph gives me a sort of instant signature of the sound, but I don't understand how that detects that a sound arrived a few ms earlier on this mic vs that mic...

1. Correct. Use your analytical skills to sort out the rubbish.
2 & 3. Take a crash course: http://www.dspguide.com/ch8/8.htm
Equation 8.6 is the key to finding the phase.
 
You didn't provide a link to the FFT library you are using; if the output data set has Re[] and Im[], then you know what to do.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #48 on: May 04, 2018, 01:43:52 am »
Alright, I returned all my electronic gizmos and went back to basics: I fired up Matlab and got this:

Code: [Select]
N=256; fs=320; t=0:1/fs:2-1/fs; % setup (fc=10; fs=32*fc) missing from this snippet
x=.5*cos(2*pi*20*t+20*pi/180) + .1*cos(2*pi*54*t-60*pi/180);
x2=.5*cos(2*pi*30*t+20*pi/180) + .1*cos(2*pi*54*t-50*pi/180);
X = 1/N*fftshift(fft(x,N));
X2 = 1/N*fftshift(fft(x2,N));
%phase thingy
phase=atan2(imag(X),real(X))*180/pi;
phase2=atan2(imag(X2),real(X2))*180/pi;
f=(-N/2:N/2-1)*fs/N; % ordered frequency axis, also missing from the snippet
plot(f,phase-phase2);
which makes two curves x and x2 offset by a phase of 10 (10 what?), and after calculating phase and phase2 I get this funky curve when subtracting the two... I was expecting to get that delta of 10, but maybe there is more massaging I need to do  :-DD
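Part of what makes that curve funky: atan2 returns a phase for every bin, including bins holding nothing but leakage and rounding noise (and in this script the first tones are 20 vs 30 Hz, so they don't even share a bin). A sketch (Python/NumPy, with both tones shared between the signals as in the follow-up script later in the thread; the 0.1 magnitude threshold is an arbitrary choice) that masks out the empty bins before differencing:

```python
import numpy as np

fs, N = 320, 256
t = np.arange(0, 2, 1 / fs)
# Two mics hearing the same two tones, each tone arriving 10 degrees later
# at the second mic.
x  = 0.5*np.cos(2*np.pi*20*t + np.deg2rad(20)) + 0.1*np.cos(2*np.pi*54*t - np.deg2rad(60))
x2 = 0.5*np.cos(2*np.pi*20*t + np.deg2rad(30)) + 0.1*np.cos(2*np.pi*54*t - np.deg2rad(50))

X, X2 = np.fft.rfft(x, N), np.fft.rfft(x2, N)

# Phase is only meaningful where there is energy; everywhere else atan2
# returns the phase of leakage and numerical noise.
mask = np.abs(X) > 0.1 * np.abs(X).max()
dphi = np.rad2deg(np.angle(X2[mask]) - np.angle(X[mask]))
print(np.round(dphi))  # about 10 degrees at each tone bin
```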


nine nine nein
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #49 on: May 04, 2018, 01:53:19 am »
You didn't provide a link to the FFT library you are using; if the output data set has Re[] and Im[], then you know what to do.
https://github.com/kosme/arduinoFFT
nine nine nein
 

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Re: Making a sound camera
« Reply #50 on: May 04, 2018, 02:20:05 am »
FFT is mostly useful as a fast way to implement convolution, and thus cross-correlation. I don't know why cross-correlation didn't work for you in Matlab... but it should; look at the example in their documentation. As you noticed, and as I said, just having a bunch of different phases for the different microphones doesn't give you much useful information; the cross-correlation gives you something useful in the time domain. As I also said, when using the FFT to implement cross-correlation you have to zero-pad the signals to twice the length (because FFT-based convolution is circular).
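The zero-padding point can be verified directly. A sketch (Python/NumPy; `np.correlate` is the O(n^2) ground truth): padding both signals to twice the length makes the FFT's circular correlation contain the full linear cross-correlation.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 256, 40
a = rng.standard_normal(n)
b = np.roll(a, d)  # a delayed copy to correlate against

# Direct (O(n^2)) linear cross-correlation as ground truth, lags -(n-1)..(n-1).
direct = np.correlate(b, a, mode="full")

# FFT version: zero-pad both signals to 2n so the circular correlation the
# FFT computes has room for every linear lag without wrap-around.
F_a = np.fft.rfft(a, 2 * n)
F_b = np.fft.rfft(b, 2 * n)
circ = np.fft.irfft(F_b * np.conj(F_a), 2 * n)

# Reorder the circular output into lags -(n-1)..(n-1).
fft_version = np.concatenate([circ[-(n - 1):], circ[:n]])

assert np.allclose(direct, fft_version)  # identical up to rounding
```

Without the padding, correlation energy from the end of the buffer wraps around and contaminates the other lags.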
« Last Edit: May 04, 2018, 02:21:54 am by Marco »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Making a sound camera
« Reply #51 on: May 04, 2018, 03:02:43 am »
Can anyone give me an explanation of why that jambalaya with FFT will give me the inter-mic delays? I understand that a frequency-domain graph gives me a sort of instant signature of the sound, but I don't understand how that detects that a sound arrived a few ms earlier on this mic vs that mic...

A short answer...

If you multiply the samples in one signal by the other, then integrate over a number
of samples, you get a measure of how 'alike' the two signals are.

If you want to match signals at different offsets, you have to do this at each offset.
The result is a time series that shows how much the signals match at different alignments.

Testing a 16k-sample signature against a set of 16k samples at each alignment will require
268,435,456 multiply-add operations. That is a lot of math: O(N^2), if you like big-O complexity notation.

FFT is an effective way to do this, but with less work.

You split the signature and test data into their frequency and phase components using FFT.

You multiply the components in one signature by the ones in the other. If the frequency
is only in one of them, it will be removed (as zero times anything = zero). If the frequency is in both, they will be enhanced (and the phase changed).

You are then left with the set of amplitudes indicating which frequencies are common to both signals, with some phase information (remember, phase information = timing information).

By running the Inverse FFT on this result, where things are 'in sync' the magnitudes of the various components add up, where they are 'out of sync' on average they cancel out.

The result is a time series that shows how much the signals match at different alignments
 - exactly the same numbers (except rounding!) as found by the hard method.

However, because the FFT and IFFT are each O(N*logN) in complexity, this is far more practical for large values of N.

So for 16k samples, the hard way takes 268,435,456 units of work. The FFT way takes roughly 3 x 16k x log2(16k) = about 688,000 units of work for the two forward transforms plus the inverse (however, the size of a unit of work differs between the two processes, so you can't just divide one number by the other to get the speed-up).

This also brings up another point. To get good timing information from such a process, your signature must have a good mix of frequencies in it; a sine wave will match another sine wave of the same frequency whenever they are in phase, giving ambiguous results. The signature must also have structure to it - one bit of pure random noise is much like any other bit of random noise.

This is one of the reasons that dolphins use 'clicks' for echolocation and radar uses 'chirps'. This technique will not let you accurately pinpoint the location of that annoying hum unless you have lots of microphones to remove the ambiguity.
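The whole recipe above, FFT both signals, multiply one spectrum by the conjugate of the other (the conjugate plays the role of the flipped signature), inverse FFT, take the peak, fits in a few lines of Python/NumPy (16k samples and a made-up 1234-sample offset):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1 << 14                   # 16k samples, as in the work estimate above
sig = rng.standard_normal(n)  # a signature with a good mix of frequencies
# The "data": a delayed (circularly shifted) copy of the signature plus noise.
data = np.roll(sig, 1234) + 0.5 * rng.standard_normal(n)

# O(n log n) cross-correlation: conjugating one spectrum is equivalent to
# flipping that signal before a convolution.
xc = np.fft.irfft(np.fft.rfft(data) * np.conj(np.fft.rfft(sig)))
print(np.argmax(xc))  # 1234: the alignment where the signals match best
```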
« Last Edit: May 09, 2018, 04:39:01 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #52 on: May 08, 2018, 03:51:58 am »
Alright @hamster_nz, I understand that the sound needs harmonics for the FFT to find a match; fortunately, hums have harmonics, which is what makes them bothersome (something to do with the brain latching onto the pattern). Thanks to your explanation I now understand that multiplying the spectra of the various mics will give me the matching frequencies.
I don't understand how the iFFT would give me the offset between the signals, which I think is what I need to triangulate the source.
I don't understand that thing you said about multiplying phases either, so I tried doing it in Matlab. The phase delta is 10 degrees between x and x2, and the tone is at 20 Hz, so the offset between x and x2 should be 1/20 * 10/360 = 0.0014 s (right?)
This graph shows 2, so I'm doing something wrong somewhere.

Code: [Select]
fc=10;
fs=32*fc;
t=0:1/fs:2-1/fs;
x=.5*cos(2*pi*20*t+20*pi/180) + .1*cos(2*pi*54*t-60*pi/180); %time domain signal with phase shift
x2=.5*cos(2*pi*20*t+30*pi/180) + .1*cos(2*pi*54*t-50*pi/180);
N=256; %FFT size
X = 1/N*fftshift(fft(x,N));%N-point complex DFT
X2 = 1/N*fftshift(fft(x2,N));%N-point complex DFT

%phase delta
X3=X.*X2;
phase3=atan2(imag(X3),real(X3))*180/pi; %phase
df=fs/N; %frequency resolution
sampleIndex = -N/2:N/2-1; %ordered index for FFT plot
f=sampleIndex*df; %x-axis index converted to ordered frequencies
plot(f,phase3.*real(X3));
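One likely bug in the script above: `X3=X.*X2` adds the phases of the two signals, while a phase difference needs a complex conjugate, i.e. `X2.*conj(X)` (the cross-spectrum). A minimal check (Python/NumPy, keeping only the 20 Hz tone, which falls exactly on bin 16 at this fs and N):

```python
import numpy as np

fs = 320
t = np.arange(0, 2, 1 / fs)
x  = 0.5 * np.cos(2 * np.pi * 20 * t + np.deg2rad(20))
x2 = 0.5 * np.cos(2 * np.pi * 20 * t + np.deg2rad(30))

N = 256
X, X2 = np.fft.rfft(x, N), np.fft.rfft(x2, N)

bin20 = int(20 * N / fs)  # 20 Hz falls exactly on bin 16: no leakage
sum_ph  = np.angle(X[bin20] * X2[bin20])           # X.*X2: phases add -> 50 deg
diff_ph = np.angle(X2[bin20] * np.conj(X[bin20]))  # conjugate: phases subtract -> 10 deg
print(np.degrees(sum_ph), np.degrees(diff_ph))
```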

« Last Edit: May 08, 2018, 04:06:37 am by daslolo »
nine nine nein
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Making a sound camera
« Reply #53 on: May 08, 2018, 12:03:18 pm »
Sorry for the large swath of code, but it is everything (although using a DFT, not an FFT):

Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define POINTS   (128)
#define FFT_SIZE (POINTS/2+1)
#define PI       (3.1415926535897932384626433)

double samples1[POINTS],       samples2[POINTS], samples1flipped[POINTS];
double spectrum1_r[FFT_SIZE], spectrum1_i[FFT_SIZE];
double spectrum2_r[FFT_SIZE], spectrum2_i[FFT_SIZE];
double spectrumX_r[FFT_SIZE], spectrumX_i[FFT_SIZE];
double samplesX[POINTS];


/**********************************************************************/
void generate_signature(double *samples) {
   int i;
   for(i = 0; i < POINTS; i++)
      samples[i] = 0;

   for(i =  0; i <  5; i++) samples[i] = -1.0;
   for(i =  5; i < 10; i++) samples[i] = 1.0;
   for(i = 10; i < 15; i++) samples[i] = -1.0;
   for(i = 15; i < 20; i++) samples[i] = 1.0;
}

/**********************************************************************/
void flip_samples(double *s1, double *s2) {
   int i;
   for(i = 0; i < POINTS; i++)
     s2[i] = s1[POINTS-1-i];
}
/**********************************************************************/
void offset_sample(double *samples1, double *samples2, int offset) {
   int i;
   for(i = 0; i < POINTS; i++)  /* C's % can be negative, so bias the index */
      samples2[i] = samples1[((i-offset)%POINTS + POINTS)%POINTS];
}

/**********************************************************************/
void complex_mult(double *r1, double *i1,
                  double *r2, double *i2,
                  double *r3, double *i3) {
   int i;
   for(i = 0; i < FFT_SIZE; i++) {
      r3[i] = (r1[i] * r2[i]) - (i1[i] * i2[i]);
      i3[i] = (r1[i] * i2[i]) + (r2[i] * i1[i]);
   }
}

/**********************************************************************/
void add_noise(double *samples, double level) {
   int i;
   for(i = 0; i < POINTS; i++) {
      samples[i] += (rand()*level)/RAND_MAX;
   }
}

/**********************************************************************/
void print_max(double *samples) {
   int    max_i = 0;
   double max = samples[0];
   int i;
   for(i = 1; i < POINTS; i++) {
      if(samples[i] > max) {
        max_i = i;
        max   = samples[i];
      }
   }

   if(max_i >= POINTS/2)
      max_i -= POINTS;
   printf("Max is at %i (%10.4f)\n", max_i, max);
}

/**********************************************************************/
void dft(double *samples, double *r, double *i) {
  int b,p;
  for(b = 0; b < FFT_SIZE; b++)
  {
    r[b] = 0.0;
    i[b] = 0.0;
    for(p = 0; p <POINTS; p++)
    {
      double angle = 2*PI*b*p/POINTS;
      r[b] += samples[p] * cos(angle);
      i[b] -= samples[p] * sin(angle);
    }
  }
}

/******************************************************/
void idft(double *r, double *i, double *samples) {
  int b,p;
  double rs[FFT_SIZE];
  double is[FFT_SIZE];

  for(b = 0; b < FFT_SIZE; b++) {
     rs[b] = r[b]/(POINTS/2);
     is[b] = -i[b]/(POINTS/2);
  }
  rs[0]          = r[0]/POINTS;
  rs[FFT_SIZE-1] = r[FFT_SIZE-1]/POINTS;

  for(p = 0; p < POINTS; p++)
  {
    for(b = 0; b < FFT_SIZE; b++)
    {
      double angle = 2*PI*b*p/POINTS;
      samples[p] += (rs[b] * cos(angle)) + (is[b] * sin(angle));
    }
  }
}
/******************************************************/
int main(int argc, char *argv[]) {
   int i;
   /* Prepare the data */
   generate_signature(samples1);
   offset_sample(samples1,samples2, 50);
   add_noise(samples2, 0.8);

   /* Flip the samples to turn convolution it to correlation */
   flip_samples(samples1,samples1flipped);

   /* Take the DFT */
   dft(samples1flipped,spectrum1_r,spectrum1_i);
   dft(samples2,spectrum2_r,spectrum2_i);

   /* Multiply the spectrums */
   complex_mult(spectrum2_r, spectrum2_i,
                spectrum1_r, spectrum1_i,
                spectrumX_r, spectrumX_i);

   /* Inverse DFT to find how well they match */
   idft(spectrumX_r,spectrumX_i,samplesX);

   /* Output the results */
   for(i = 0; i < POINTS; i++) {
      printf("%4i, %10.7f, %10.7f, %10.7f\n",i, samples1[i], samples2[i], samplesX[i]);
   }
   print_max(samplesX);
   return 0;
}


In the picture below:

- The first graph is the 'signature' being looked for.

- The second is the data to look for the signature in.

- The third graph is the output of the FFT correlation; the peak shows where the second signal is most like the first. Note that the peak is at the offset you need to align to.

I will be the first to admit that this is just a hack; you will need to develop and test it...
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: daslolo

Online Marco

  • Super Contributor
  • ***
  • Posts: 6723
  • Country: nl
Re: Making a sound camera
« Reply #54 on: May 08, 2018, 01:36:20 pm »
The signature must also have structure to it - one bit of pure random noise is much like any other bit of random noise.

It's the opposite: each bit of noise is unique. A white noise source will give a really clear spike when you cross-correlate two microphones receiving it.
 

Offline ebastler

  • Super Contributor
  • ***
  • Posts: 6510
  • Country: de
Anyway I got the FFT running, and having two cores is sweet, so I might as well use programming to make this thing.
https://github.com/laurentopia/M5-Signal-Multimeter

What's the hardware in that picture? (The little screen plus processor device to the left?) Looks neat!
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Making a sound camera
« Reply #56 on: May 09, 2018, 04:38:10 am »
The signature must also have structure to it - one bit of pure random noise is much like any other bit of random noise.

It's the opposite, each bit of noise is unique. A white noise source will give a really clear spike when you cross correlate two microphones receiving it.

Yes, you are right.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #57 on: May 09, 2018, 05:31:24 am »
Anyway I got the FFT running, and having two cores is sweet, so I might as well use programming to make this thing.
https://github.com/laurentopia/M5-Signal-Multimeter

What's the hardware in that picture? (The little screen plus processor device to the left?) Looks neat!
M5Stack
nine nine nein
 
The following users thanked this post: ebastler

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #58 on: May 09, 2018, 11:18:14 pm »
I will be the first to admit that this is just a hack, you will need to develop and test it...

Thank you so much: it does what I need. I like that you didn't bother with the FFT, and I never realized how slow the DFT is until now :)
Why is the FFT size half the sample size?
In terms of CPU cost, with 5 microphones will this need 5! = 120x iFFT + 5x FFT + 5x flipped FFT?
You flip X1 before doing a complex mult; is this equivalent to a convolution? If so, could the entire thing be done by applying an NxN kernel to both signals?

If anyone wants to try it real quick on their MCU, here is the Arduino code:

Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "Adafruit_GFX.h"// Hardware-specific library
#include <MCUFRIEND_kbv.h>
MCUFRIEND_kbv tft;

#define POINTS   (128)
#define FFT_SIZE (POINTS/2+1)
#define PI       (3.1415926535897932384626433)

double samples1[POINTS], samples2[POINTS], samples1flipped[POINTS];
double spectrum1_r[FFT_SIZE], spectrum1_i[FFT_SIZE];
double spectrum2_r[FFT_SIZE], spectrum2_i[FFT_SIZE];
double spectrumX_r[FFT_SIZE], spectrumX_i[FFT_SIZE];
double samplesX[POINTS];


/**********************************************************************/
void generate_signature(double *samples)
{
int i;
for (i = 0; i < POINTS; i++)
samples[i] = 0;

for (i = 0; i < 5; i++) samples[i] = -1.0;
for (i = 5; i < 10; i++) samples[i] = 1.0;
for (i = 10; i < 15; i++) samples[i] = -1.0;
for (i = 15; i < 20; i++) samples[i] = 1.0;
}

/**********************************************************************/
void flip_samples(double *s1, double *s2)
{
int i;
for (i = 0; i < POINTS; i++)
s2[i] = s1[POINTS - 1 - i];
}
/**********************************************************************/
void offset_sample(double *samples1, double *samples2, int offset)
{
int i;
for (i = 0; i < POINTS; i++) // C's % can be negative, so bias the index
samples2[i] = samples1[((i - offset) % POINTS + POINTS) % POINTS];
}

/**********************************************************************/
void complex_mult(double *r1, double *i1,
double *r2, double *i2,
double *r3, double *i3)
{
int i;
for (i = 0; i < FFT_SIZE; i++)
{
r3[i] = (r1[i] * r2[i]) - (i1[i] * i2[i]);
i3[i] = (r1[i] * i2[i]) + (r2[i] * i1[i]);
}
}

/**********************************************************************/
void add_noise(double *samples, double level)
{
int i;
for (i = 0; i < POINTS; i++)
{
samples[i] += (rand()*level) / RAND_MAX;
}
}

/**********************************************************************/
void print_max(double *samples)
{
int    max_i = 0;
double max = samples[0];
int i;
for (i = 1; i < POINTS; i++)
{
if (samples[i] > max)
{
max_i = i;
max = samples[i];
}
}

if (max_i >= POINTS / 2)
max_i -= POINTS;
printf("Max is at %i (%10.4f)\n", max_i, max);
 tft.drawLine(max_i, 50, max_i,150, 0b111111111100000);
tft.setCursor(0, 170);
tft.print("Offset = ");
 tft.print(max_i);
}

/**********************************************************************/
void dft(double *samples, double *r, double *i)
{
int b, p;
for (b = 0; b < FFT_SIZE; b++)
{
r[b] = 0.0;
i[b] = 0.0;
for (p = 0; p < POINTS; p++)
{
double angle = 2 * PI*b*p / POINTS;
r[b] += samples[p] * cos(angle);
i[b] -= samples[p] * sin(angle);
}
}
}

/******************************************************/
void idft(double *r, double *i, double *samples)
{
int b, p;
double rs[FFT_SIZE];
double is[FFT_SIZE];

for (b = 0; b < FFT_SIZE; b++)
{
rs[b] = r[b] / (POINTS / 2);
is[b] = -i[b] / (POINTS / 2);
}
rs[0] = r[0] / POINTS;
rs[FFT_SIZE - 1] = r[FFT_SIZE - 1] / POINTS;

for (p = 0; p < POINTS; p++)
{
for (b = 0; b < FFT_SIZE; b++)
{
double angle = 2 * PI*b*p / POINTS;
samples[p] += (rs[b] * cos(angle)) + (is[b] * sin(angle));
}
}
}
/******************************************************/
void setup(void)
{
Serial.begin(115200);
  uint16_t ID = tft.readID();
  tft.begin(ID);
  tft.setRotation(1);
  tft.fillScreen(0x0000);

int i;
/* Prepare the data */
generate_signature(samples1);
offset_sample(samples1, samples2, 50);
add_noise(samples2, 0.8);

 double timer = micros();

/* Flip the samples to turn convolution it to correlation */
flip_samples(samples1, samples1flipped);

/* Take the DFT */
dft(samples1flipped, spectrum1_r, spectrum1_i);
dft(samples2, spectrum2_r, spectrum2_i);

/* Multiply the spectrums */
complex_mult(spectrum2_r, spectrum2_i,
spectrum1_r, spectrum1_i,
spectrumX_r, spectrumX_i);

/* Inverse DFT to find how well they match */
idft(spectrumX_r, spectrumX_i, samplesX);

tft.setCursor(0,0);
  tft.println(micros()-timer);

for (i = 0; i < POINTS; i++)
{
tft.drawLine(i, 100 - 10 * samples1[i - 1], i, 100 - 10 * samples1[i], 0b0000000000011111);
tft.drawLine(i, 100 - 10 * samples2[i - 1], i, 100 - 10 * samples2[i], 0b0000010000011111);
tft.drawLine(i, 100 - 2 * samplesX[i - 1], i, 100 - 2 * samplesX[i],   0b1111100000000000);
}
print_max(samplesX);
}

void loop(void) {}
« Last Edit: May 10, 2018, 12:06:04 am by daslolo »
nine nine nein
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Making a sound camera
« Reply #59 on: May 10, 2018, 04:19:41 am »
Quote from: daslolo
Thank you so much: it does what I need. I like that you didn't bother with FFT, and I never realized how slow DFT is until now :)

You could easily make it a bit faster by not calling sin() and cos() so often: build a lookup table instead, at a cost in memory. That way sin() and cos() become something like:

Code: [Select]
sin = sin_table[(index*band)%table_size];
cos = sin_table[(table_size/4 + index*band)%table_size];

You have to 'flip' (actually mirror in time) the needle that you are looking for in the haystack. My code is slightly wrong and gives you a peak whose offset is off by one: item 0 should not move, but items [1] and [n-1] should swap, and so on.

I am not sure you will need all 120 IFFTs. You may only need to do a handful to get enough information that you can locate the source.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline dasloloTopic starter

  • Regular Contributor
  • *
  • Posts: 63
  • Country: fr
  • I saw wifi signal once
Re: Making a sound camera
« Reply #60 on: May 11, 2018, 01:30:52 am »
Oh yeah, just 20x faster  ;D
When I get the Raspberry Pi tomorrow I'll get to work, for real.
Is anyone using bare-metal environments like https://ultibo.org/?
nine nine nein
 

