USB3 - look at the FTDI FT60x devices.
Huh, haven't thought about these.
Thanks!
Yes, but a little push wouldn't hurt.
Perhaps, but in the end it made it there on its own.

For some reason it feels bad to advertise or push my projects.
One tip with the SD cards - you may find that if you pre-erase every sector with FFs first before starting a recording
I tried that; it doesn't do anything.
Could you overlay the image onto a light image? Like a thermal camera does.
Yep, that's a part of the plan.
It should even be better than thermal cameras, since I can stick the visible camera into the perfect center of the array.
I wonder if deaf people would find it useful?
I don't see what for, honestly.
Regarding the data rate to the host, to what extent could this be reduced by doing at least some initial processing in the FPGA?
Most of the FFT stuff, but that does not reduce the amount of data, just makes it easier for the PC to do the rest.
Forming the final image is a bit too complex for an FPGA, and everything below that uses the same amount of data.
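To illustrate why an FFT alone does not shrink the stream, here is a minimal sketch. It assumes a plain real-input FFT over one microphone's chunk of 128 samples (the chunk size quoted elsewhere in this thread); the random data and the NumPy implementation are stand-ins, not what the FPGA actually runs.

```python
import numpy as np

CHUNK = 128  # samples per chunk; figure quoted elsewhere in the thread

# Stand-in data for one microphone (assumption: real-valued samples).
rng = np.random.default_rng(0)
chunk = rng.standard_normal(CHUNK)

# A real-input FFT of N samples yields N/2 + 1 complex bins.
spectrum = np.fft.rfft(chunk)

values_in = chunk.size            # real numbers going in
values_out = 2 * spectrum.size    # real + imaginary parts coming out

# The transform is invertible, so no information was dropped either.
restored = np.fft.irfft(spectrum, n=CHUNK)

print(values_in, values_out, np.allclose(restored, chunk))
```

Under these assumptions the output carries about as many numbers as the input, so the link to the PC stays just as busy; the PC is only spared the transform itself.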
Did a quick sim, here's the PSF for a dead-ahead source with a 16x16 grid:
Looks about right.
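For anyone who wants to reproduce that kind of sim, a rough sketch follows. It is not the commenter's code: it assumes a narrowband far-field source, a uniform 16x16 grid, and half-wavelength spacing (`d_over_lambda` is a made-up parameter), and computes the PSF as the plain delay-and-sum array factor.

```python
import numpy as np

N = 16                # 16x16 grid, as in the comment above
d_over_lambda = 0.5   # assumption: element spacing of half a wavelength


def psf_1d(u, n=N, d=d_over_lambda):
    """Normalized array factor of n evenly spaced elements.

    u is sin(angle off the array normal); the source is dead ahead (u = 0).
    """
    k = np.arange(n) - (n - 1) / 2                  # element positions, centered
    phases = np.exp(2j * np.pi * d * np.outer(u, k))
    return np.abs(phases.sum(axis=1)) / n


u = np.linspace(-1.0, 1.0, 401)   # direction cosines along one axis
row = psf_1d(u)

# For a rectangular grid the 2-D response is separable into two 1-D responses.
psf = np.outer(row, row)

peak = tuple(int(i) for i in np.unravel_index(np.argmax(psf), psf.shape))
first_null = psf_1d(np.array([1.0 / (N * d_over_lambda)]))[0]
print(peak, psf.max(), first_null)
```

With these numbers the main lobe sits in the middle of the map and the first null falls at sin(angle) = 1/(N * d_over_lambda) = 0.125. A wider spacing narrows the lobe but brings grating lobes into the map.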
I haven't considered a random grid approach; the aliasing is simply disregarded by limiting the FOV of the camera.
In practice, it does not seem to be a problem.
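As a rough guide to how wide that FOV can be, the usual uniform-grid rule is that a source at sin(angle) looks the same as one at sin(angle) ± wavelength/spacing, so the view stays unambiguous while |sin(angle)| < wavelength / (2 * spacing). A small sketch of that rule, where the 40 mm spacing and the frequencies are hypothetical numbers and not the real array's:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumption: air at roughly room temperature


def alias_free_half_angle_deg(freq_hz, spacing_m, c=SPEED_OF_SOUND):
    """Largest off-axis angle with no grating-lobe ambiguity on a uniform grid."""
    wavelength = c / freq_hz
    s = wavelength / (2.0 * spacing_m)
    if s >= 1.0:
        return 90.0  # spacing is at most half a wavelength: the whole hemisphere is clean
    return math.degrees(math.asin(s))


# Hypothetical example: 40 mm microphone spacing.
low = alias_free_half_angle_deg(1000.0, 0.040)
high = alias_free_half_angle_deg(8000.0, 0.040)
print(low, high)
```

The higher the frequency of interest, the narrower the clean cone, which is why cropping the camera FOV to that cone works as a fix.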
Here's another application - monitor the audience at quiet concerts and point a STFU gobo-light at the inconsiderate gits who insist on talking through the music.
Isn't that the same as the mosquito tracking, only with a weaker laser?

Uhm, I probably got something completely wrong, but why couldn't you do the same thing with a binaural microphone and some "magic mathematics stuff" to exactly resolve the sound sources based on phase shift, amplitude and Blauert's frequency bands?
AFAIK, there is a difference between "resolve" and "image".
I'm not aware of any means to make an image with a higher resolution than the number of sensor positions.
I'm thinking that if you have a 44.1 kHz sampling rate, and you divide the world into 1/60 second chunks, then in every chunk you have 44.1k / 60 = 735 samples, and 256 channels.
I'm sampling at 48 kHz, in chunks of 128, 30 times a second.
This thing does not record sound as we know it, only as much as it needs to do the math.
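Putting numbers on that, as a back-of-the-envelope sketch: the 48 kHz, 128-sample and 30-per-second figures are from the reply above, read as "128 samples are kept per channel, 30 times a second". The 256 channels come from the question, and 2 bytes per sample is purely an assumption for illustration.

```python
SAMPLE_RATE_HZ = 48_000    # from the reply
CHUNK_SAMPLES = 128        # from the reply
CHUNKS_PER_SECOND = 30     # from the reply
CHANNELS = 256             # assumption: the figure used in the question
BYTES_PER_SAMPLE = 2       # assumption: 16-bit samples, not stated anywhere

# How long one chunk lasts, and how much of each second is actually kept.
chunk_duration_ms = 1000 * CHUNK_SAMPLES / SAMPLE_RATE_HZ
kept_per_channel = CHUNK_SAMPLES * CHUNKS_PER_SECOND
kept_fraction = kept_per_channel / SAMPLE_RATE_HZ

# Totals across the whole array.
total_samples_per_s = kept_per_channel * CHANNELS
total_bytes_per_s = total_samples_per_s * BYTES_PER_SAMPLE

print(f"{chunk_duration_ms:.2f} ms per chunk, {kept_fraction:.0%} of the audio kept")
print(f"{total_samples_per_s} samples/s, {total_bytes_per_s} bytes/s")
```

So each chunk covers under 3 ms, and under this reading only about 8% of the audio is ever captured, far less than the 735 samples per frame a continuous recording would give.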
Before you go too far, you might want to research the following areas: underwater acoustic imaging and side-scan sonar.
Perhaps, but the sonar is somewhat of a different beast than this.