Author Topic: The Daredevil Camera - seeing the world in sound  (Read 9640 times)


Offline Brutte

  • Frequent Contributor
  • **
  • Posts: 614
Re: The Daredevil Camera - seeing the world in sound
« Reply #25 on: June 30, 2016, 07:58:53 pm »
Here's another application - monitor the audience at quiet concerts and point a STFU gobo-light at the inconsiderate gits who insist on talking through the music.
I'd wire the eChairs(TM) into 2D rows and columns. The sound monitor would just govern row and column multiplexer  >:D
 

Online Someone

  • Super Contributor
  • ***
  • Posts: 5375
  • Country: au
    • send complaints here
Re: The Daredevil Camera - seeing the world in sound
« Reply #26 on: June 30, 2016, 09:30:30 pm »
Did a quick sim, here's the PSF for a dead-ahead source with a 16x16 grid:
Hello, would you share your simulation sources? It would be interesting to see the maths and how they compare to, say, a DCT approach.
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2322
  • Country: au
Re: The Daredevil Camera - seeing the world in sound
« Reply #27 on: June 30, 2016, 10:31:39 pm »
Hello, would you share your simulation sources? It would be interesting to see the maths and how they compare to, say, a DCT approach.

Unfortunately this code is a MATLAB(/GNU Octave) special; dense and fun to write but not exactly the epitome of readability. Here it is, with no guarantee of intelligibility.

Code:
% 256 microphone positions, arranged in 16x16 (expressed as complex
% numbers).
array = kron(1:16, ones(16,1)) + 1i * kron((1:16)', ones(1,16)) - (8.5+8.5i);
array = array(:);

% OR, 256 microphones in random places within the same square area.
%array = (15 * rand(256, 1) + 15i * rand(256,1)) - (7.5 + 7.5i);

% X & Y coordinates spread over 640x480 image.
x = kron(-319.5:320,ones(480,1));
y = kron((-239.5:240)',ones(1,640));

% This variable will accumulate the amplitude and phase of the waves to
% each microphone, for every pixel and frequency in the image (3 slightly
% different frequencies are tracked, to give colour to the image).
s = zeros(480, 640, 3);

% For each microphone in the array,
for pos = array'
    % Far-field path difference: project the mic position onto each
    % pixel's direction vector.
    distance = real(conj(pos) * (x + 1i * y));
   
    % Convert the distance into a phase (in radians? :P) for three
    % different fixed frequencies.
    phase = cat(3, 0.1 * distance, 0.105 * distance, 0.11 * distance);
   
    % Convert the phase to a phasor, and accumulate in s.
    s = s + exp(1i * phase);
end

% Display amplitude at each pixel, with sRGB gamma compensation (2.2).
imshow((abs(s) / 242).^2.2)
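For anyone without MATLAB/Octave handy, here's a rough NumPy port of the same idea. This is my own translation, not rs20's code, so treat it as a sketch; one observation from it is that the 242 normalisation constant is close to the on-axis peak amplitude (the pixel grid sits half a pixel off exact broadside, so the peak is ~242.7 rather than 256).

```python
import numpy as np

# 256 microphone positions on a 16x16 unit-spaced grid, centred on the
# origin, expressed as complex numbers (same layout as the Octave code).
gx, gy = np.meshgrid(np.arange(1, 17), np.arange(1, 17))
mics = (gx + 1j * gy - (8.5 + 8.5j)).ravel()

# X & Y pixel coordinates spread over a 640x480 image.
x, y = np.meshgrid(np.arange(640) - 319.5, np.arange(480) - 239.5)

# Accumulate the phasor sum for three slightly different frequencies
# (one per colour channel).
s = np.zeros((480, 640, 3), dtype=complex)
for pos in mics:
    # Far-field path difference from this mic towards each pixel direction.
    distance = (np.conj(pos) * (x + 1j * y)).real
    for c, k in enumerate((0.1, 0.105, 0.11)):
        s[:, :, c] += np.exp(1j * k * distance)

# 242 is roughly the on-axis peak amplitude, so the main lobe maps to ~1.0.
img = (np.abs(s) / 242) ** 2.2
```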
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2322
  • Country: au
Re: The Daredevil Camera - seeing the world in sound
« Reply #28 on: June 30, 2016, 11:02:15 pm »
Regarding interface, I have a crazy but fun (to me) thought. I'm thinking that if you have a 44.1kHz sampling rate, and you divide the world into 1/60 second chunks, then in every chunk you have 44.1k / 60 = 735 samples, and 256 channels. This means your data could be presented to the computer as a webcam providing 735x256 "pixel" "images" at 60 "f"ps. Obviously you wouldn't want JPEG compression on those "images", nor would this get around the need for USB3. However, if you can find a USB3 webcam interface chip, then you're golden, and writing your analysis code as a GPU shader that takes the "webcam video" as an input becomes very interesting.
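As a sanity check on those numbers, here's the framing arithmetic spelled out (my sketch; the 16-bit sample depth is an assumption, rs20 doesn't state a bit depth):

```python
sample_rate = 44_100   # Hz
frame_rate = 60        # "video" frames per second
channels = 256
bytes_per_sample = 2   # assuming 16-bit samples

samples_per_frame = sample_rate // frame_rate          # 735
# Each "frame" is then a 735x256 "image" of raw samples.
bytes_per_frame = samples_per_frame * channels * bytes_per_sample
data_rate = bytes_per_frame * frame_rate               # bytes per second

print(samples_per_frame, data_rate)
```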
 

Offline barry14

  • Regular Contributor
  • *
  • Posts: 102
  • Country: us
Re: The Daredevil Camera - seeing the world in sound
« Reply #29 on: June 30, 2016, 11:14:38 pm »
Before you go too far, you might want to research the following areas: underwater acoustic imaging and side-scan sonar.  Techniques to obtain high resolution images of underwater objects (including shipwrecks, cables and mines) using sound have been in existence for many years. They may provide you some insight for your project.
 

Online bitwelder

  • Super Contributor
  • ***
  • Posts: 1018
  • Country: fi
Re: The Daredevil Camera - seeing the world in sound
« Reply #30 on: July 01, 2016, 05:03:03 am »
Here's another application - monitor the audience at quiet concerts and point a STFU gobo-light at the inconsiderate gits who insist on talking through the music.
I'd like to see it at public libraries too (where it can double as 'art project')
 

Offline ArtlavTopic starter

  • Frequent Contributor
  • **
  • Posts: 750
  • Country: mon
    • Orbital Designs
Re: The Daredevil Camera - seeing the world in sound
« Reply #31 on: July 02, 2016, 12:01:43 pm »
USB3 - look at the FTDI FT60x devices.
Huh, haven't thought about these.
Thanks!

Yes but a little push wouldn't hurt
Perhaps, but in the end it made it there on its own. :)
For some reason it feels bad to advertise or push my projects.

One tip with the SD cards - you may find that if you pre-erase every sector with FFs first before starting a recording
I tried that, doesn't do anything.

Could you overlay the image onto a light image? Like a thermal camera does.
Yep, that's a part of the plan.
It should even be better than thermal cameras, since I can stick the visible camera into the perfect center of the array.

I wonder if deaf people would find it useful?
I don't see what for, honestly.

Regarding the data rate to the host, to what extent could this be reduced by doing at least some initial process in the FPGA ?
Most of the FFT stuff, but that does not reduce the amount of data, just makes it easier for the PC to do the rest.
Forming the final image is a bit too complex for FPGA, and everything below that uses the same amount of data.

Did a quick sim, here's the PSF for a dead-ahead source with a 16x16 grid:
Looks about right.
I haven't considered a random grid approach; the aliasing is simply avoided by limiting the FOV of the camera.
In practice, it does not seem to be a problem.

Here's another application - monitor the audience at quiet concerts and point a STFU gobo-light at the inconsiderate gits who insist on talking through the music.
Isn't that the same as the mosquito tracking, only with a weaker laser? :)

Uhm, probably I got something completely wrong, but why couldn't you do the same thing with a binaural microphone and some "magic mathematics stuff" to exactly resolve the sound sources based on phase shift, amplitude and blauerts frequency bands?
AFAIK, there is a difference between "resolve" and "image".
I'm not aware of any means to make an image with a higher resolution than the number of sensor positions.

I'm thinking that if you have a 44.1kHz sampling rate, and you divide the world in 1/60 second chunks, then in every chunk you have 44.1k / 60 = 735 samples, and 256 channels.
I'm sampling at 48 kHz, in chunks of 128, 30 times a second.
This thing does not record sound as we know it, only as much as it needs to do the math.

Before you go too far, you might want to research the following areas: underwater acoustic imaging and side-scan sonar.
Perhaps, but the sonar is somewhat of a different beast than this.
 

Offline ArtlavTopic starter

  • Frequent Contributor
  • **
  • Posts: 750
  • Country: mon
    • Orbital Designs
Re: The Daredevil Camera - seeing the world in sound
« Reply #32 on: July 02, 2016, 12:05:11 pm »
Oh, and if anyone is interested, here is the sample of the raw data for the last 16x16 video of the article:
http://orbides.org/etc/snd_16x16_raw.tar.gz (4 Mb)

One file per cell, 16-bit signed samples.
Stored row by row, 128 slices (at 48 kHz) per frame (at 10 FPS).
Samples can overflow only once per sample step.

I'm hardly a signal processing expert, so if any of the gurus can get a better image than what I got, it should be interesting.
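For anyone who wants to play with the dump: a cell's file should load as headerless 16-bit signed samples. This is my sketch, not Artlav's code; the exact file names inside the archive and the byte order (assumed little-endian native) are assumptions, so adjust to whatever it unpacks to.

```python
import numpy as np

def read_cell(path):
    # One file per cell: raw 16-bit signed samples, no header (assumed).
    return np.fromfile(path, dtype=np.int16)
```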
 

Offline VK5RC

  • Supporter
  • ****
  • Posts: 2673
  • Country: au
Re: The Daredevil Camera - seeing the world in sound
« Reply #33 on: July 02, 2016, 12:44:39 pm »
If I recall correctly this was discussed briefly on one of the later AmpHour podcasts, the one with the CEO of Digilent; he gave this as an example of where parallel processing, as in an FPGA, is vital.
Whoah! Watch where that landed we might need it later.
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 14220
  • Country: gb
    • Mike's Electric Stuff
Re: The Daredevil Camera - seeing the world in sound
« Reply #34 on: July 02, 2016, 01:09:34 pm »
Quote
I wonder if deaf people would find it useful?
I don't see what for, honestly.
A smaller version might be useful as a virtual phased array to automatically identify and focus on one speaker in a noisy environment
Youtube channel: Taking weird stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline electr_peter

  • Supporter
  • ****
  • Posts: 1458
  • Country: lt
Re: The Daredevil Camera - seeing the world in sound
« Reply #35 on: July 02, 2016, 05:19:24 pm »
Have a look at these videos https://www.youtube.com/user/sminstruments/videos
There are some examples of real video and sound direction merging.
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 10289
  • Country: gb
Re: The Daredevil Camera - seeing the world in sound
« Reply #36 on: July 03, 2016, 04:37:30 am »
Quote
I wonder if deaf people would find it useful?
I don't see what for, honestly.
A smaller version might be useful as a virtual phased array to automatically identify and focus on one speaker in a noisy environment
Using microphone arrays to pick out speakers at a conference has been the subject of considerable research. You can identify a speaker and accentuate their voice against the background without too much trouble. However, obtaining clean audio from an identified speaker is tough, and I haven't seen a really successful example.
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 10289
  • Country: gb
Re: The Daredevil Camera - seeing the world in sound
« Reply #37 on: July 03, 2016, 04:39:52 am »
It's very cool to see how a device like this has become practical as a personal project. The last time I worked in sonar, in the 80s, devices like this occupied a substantial team for a considerable period. Of course, they cost a fortune, too. :)
 

Offline radar_macgyver

  • Frequent Contributor
  • **
  • Posts: 772
  • Country: us
Re: The Daredevil Camera - seeing the world in sound
« Reply #38 on: July 03, 2016, 06:12:55 pm »
Cool project!

You might consider doing averaging in the FPGA (average together a number of FFTs) prior to sending to the host. If the number of averages can be controlled, you have the option of reducing the data rate to make it easier to work with. If what you're looking for is visualization, then you don't need a super high update rate anyway.
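A sketch of what that averaging might look like (my illustration, not the project's code): average the power spectra of several consecutive chunks so that n_avg raw spectra leave the FPGA as one frame. Note this discards phase, so it suits display/visualization rather than beamforming input.

```python
import numpy as np

def averaged_spectrum(samples, fft_len=128, n_avg=8):
    # Split the stream into n_avg chunks of fft_len samples each,
    # take the power spectrum of each, and average them into one frame.
    chunks = samples[:fft_len * n_avg].reshape(n_avg, fft_len)
    spectra = np.abs(np.fft.rfft(chunks, axis=1)) ** 2
    return spectra.mean(axis=0)
```

With n_avg = 8 this cuts the spectral data rate to the host by a factor of eight, at the cost of time resolution.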

You could also consider doing an FFT in the audio sample domain (if I read your posts correctly, you are currently doing the FFTs only in the spatial X/Y domains). This lets you perform filtering (bandpass) in the FPGA and select just voice, or some other frequency band of interest. Do three different audio bands, then the spatial FFTs on each band. Drive the output to R, G and B (or better yet, H,S,V) to get a better visual effect. FPGA hardware can be time-multiplexed to handle the computations.

The idea of doing speaker identification, etc is interesting, but you'll end up having to fight reflections and sidelobe pickup. The former can only be solved with an anechoic chamber. The latter may be attacked with better beamforming techniques than a plain 2D FFT. Capon beamforming is a technique often used in phased array radar. You can also apply a 2D window function prior to doing the beamforming FFT, this reduces sidelobes but at the cost of resolution.
 

Offline ArtlavTopic starter

  • Frequent Contributor
  • **
  • Posts: 750
  • Country: mon
    • Orbital Designs
Re: The Daredevil Camera - seeing the world in sound
« Reply #39 on: July 06, 2016, 11:08:04 am »
A smaller version might be useful as a virtual phased array to automatically identify and focus on one speaker in a noisy environment
You don't have to be deaf for that to be really useful.

Hm, you'd probably need to not be deaf at all to use it, unless there is a way to transform sound into something a deaf person can recognize.
I've read that the early voice coders were used to allow deaf people to talk over the phone essentially by printing out sound spectrum with some hints added.
Anyway, that's a completely different project.

this issue was discussed briefly on one of the later AmpHour podcasts, the one with the CEO of Digilent, he gave this as an example of where parallel processing is vital as in a FPGA.
Indeed.
I mentioned it in the article as well.

The last time I worked in sonar, in the 80s, devices like this occupied a substantial team for a considerable period. Of course, they cost a fortune, too. :)
Technology marches on. :)
For example, the quadrocopter was invented back in the late 1920s. What made them impractical back then was the lack of a way to keep them stable.
Then, in the early 2000s, chip-sized rotation sensors appeared, and enthusiasts all over the world started to put quadrotors together - all the other parts had already been available for decades.

Just a "what if" at the right time, while looking at the right part on the online catalog, and a drone is born.

You might consider doing averaging in the FPGA (average together a number of FFTs) prior to sending to the host. If the number of averages can be controlled, you have the option of reducing the data rate to make it easier to work with.
I don't see how averaging can help reduce the data rate - it would still send the same amount of data, just take longer to produce it.

If what you're looking for is visualization, then you don't need a super high update rate anyway.
30 frames per second, or it's no fun. :)

You could also consider doing an FFT in the audio sample domain (if I read your posts correctly, you are currently doing the FFTs only in the spatial X/Y domains).
I'm doing that already.
Without it, the spatial FFT won't work at all.
The colors in the videos correspond to the frequencies of the sound being seen.
 

