I'm unsure if your problem is how to record 12 microphones or how to locate a sound.
I did a sound locating system once using a PC soundcard with 2 stereo inputs = 4 channels = 4 mono microphones. My system didn't calculate elevation as the 4 microphones were on the same plane.
Put the mikes in pairs, each pair a known distance apart, and put Card 1's pair at a known angle to Card 2's, such as 90 degrees, i.e. a cross/box shape.
The delay offset for Card 1 Channel A vs Channel B will give two possible bearings on that plane - not elevation.
The delay offset for Card 2 Channel A vs Channel B will give two possible bearings on that plane - not elevation.
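A minimal sketch of turning one pair's delay into its two candidate bearings might look like this. It assumes the source is far away compared with the mic spacing (so the wavefront is roughly flat), uses the 344 m/s figure from this answer, and the function name `pair_bearings` and its parameters are made up for illustration.

```python
import math

SPEED_OF_SOUND = 344.0  # m/s, assumed sea-level figure


def pair_bearings(delay_samples, sample_rate, spacing_m, axis_deg=0.0):
    """Return the two candidate bearings (degrees, 0-360) for one mic pair.

    delay_samples: how many samples later the sound reached mic B than mic A
                   (negative if B heard it first).
    axis_deg:      direction the pair's axis points, from mic B towards mic A.
    """
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    # Extra distance the sound travelled to reach mic B.
    path_diff = SPEED_OF_SOUND * delay_samples / sample_rate
    # Far-field geometry: path_diff = spacing * cos(angle from the axis).
    # Clamp so measurement noise cannot push acos() out of its domain.
    ratio = max(-1.0, min(1.0, path_diff / spacing_m))
    theta = math.degrees(math.acos(ratio))
    # The pair cannot tell which side of its axis the source is on.
    return ((axis_deg + theta) % 360.0, (axis_deg - theta) % 360.0)


if __name__ == "__main__":
    # Mics 0.344 m apart, sound arrives 22.05 samples later at mic B.
    print(pair_bearings(22.05, 44100, 0.344))
```

The two values are mirror images about the pair's axis, which is why a single pair is not enough on its own.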
Cards 1 and 2 might have a small amount of latency relative to each other, but you can analyze the waveforms to determine that and compensate as required.
Cross-reference the two directions and there's your source bearing - add more microphones for more resolution and elevation by working in another plane.
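One way to sketch the cross-referencing step: take the two candidate bearings from each pair and keep the combination that agrees best. The helper names here (`angular_gap`, `cross_reference`) are hypothetical, and this is only an illustration of the idea for two pairs on the same plane.

```python
def angular_gap(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def cross_reference(cands1, cands2):
    """Pick the bearing on which the two mic pairs agree.

    cands1, cands2: the two candidate bearings (degrees) from each pair.
    Returns (bearing, gap) where gap is how far apart the two chosen
    candidates were; a large gap suggests a bad delay measurement.
    """
    gap, a, b = min(
        (angular_gap(a, b), a, b) for a in cands1 for b in cands2
    )
    # Average the two agreeing candidates, taking care at the 0/360 wrap.
    signed = (b - a + 180.0) % 360.0 - 180.0
    bearing = (a + signed / 2.0) % 360.0
    return bearing, gap


if __name__ == "__main__":
    # Pair 1 (axis at 0 degrees) says 30 or 330; pair 2 (axis at 90) says 150 or 30.
    print(cross_reference((30.0, 330.0), (150.0, 30.0)))
```

In this example only the 30 degree candidate appears in both pairs, so the mirror-image bearings drop out.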
https://en.wikipedia.org/wiki/Boomerang_(countermeasure)
This was the inspiration to make my own - I got bored after the 4 mikes and never did the elevation pair and I certainly never wrote the same level of algorithms to process masses of background noise. But mine worked fine at the local range with little background noise :p
Sound at sea level travels at about 1,129ft/344m per second, so assuming 44.1kHz input, one sample of delay corresponds to a theoretical path-difference resolution of about 0.026ft (0.3 inches) or 0.0078m (7.8mm) - mine was good, but not that good :p
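The arithmetic behind that figure is just the distance sound travels during one sample. A quick sketch, using the speed and sample rate quoted above (`path_step` is a made-up name):

```python
def path_step(speed_per_second, sample_rate):
    """Distance sound travels during one sample period."""
    return speed_per_second / sample_rate


if __name__ == "__main__":
    rate = 44100  # samples per second
    metres = path_step(344.0, rate)   # speed in m/s
    feet = path_step(1129.0, rate)    # speed in ft/s
    print(f"{metres * 1000:.1f} mm per sample")
    print(f"{feet * 12:.2f} inches per sample")
```

A higher sample rate shrinks the step proportionally, so 96kHz input would roughly halve it.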
It's easier for sharp sounds which give a really clear peak to match, such as the crack of a bullet - a bit harder for conversation, which has less clear peaks.
Sorry, it's a software solution and not an EE/hardware one.
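A common way to find the delay between two channels is to slide one against the other and see where they line up best (cross-correlation). The sketch below is a plain-Python illustration of that idea, not a tuned implementation; `estimate_delay` and `max_lag` are names chosen for the example, and the channels are assumed to be equal-rate lists of samples.

```python
def estimate_delay(a, b, max_lag):
    """Return how many samples channel b lags channel a.

    Tries every shift from -max_lag to +max_lag and keeps the one where
    the two waveforms multiply together to the largest total.
    A negative result means b heard the sound first.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    # A sharp click at sample 20 on channel A, arriving 3 samples later on B.
    chan_a = [0.0] * 50
    chan_b = [0.0] * 50
    chan_a[20], chan_a[21] = 1.0, -0.5
    chan_b[23], chan_b[24] = 1.0, -0.5
    print(estimate_delay(chan_a, chan_b, 10))
```

Set `max_lag` to a little more than mic spacing divided by the distance per sample, since a real delay cannot be longer than that. For long recordings this double loop is slow, and something like `numpy.correlate` would be the usual replacement.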
Regarding 12 inputs: there are audio/studio sound cards with 4+ analogue input channels and you can have multiple soundcards per PC - you can also get special boards with 6+ PCI slots.
When I was working for a radio station, one project was to beat-match the current track with the next track when the system was in automatic mode, then fade out and fade in the tracks.
Problem was it worked so accurately we got complaints the audio sounded mechanical so I had to add in random delays :p.
Which shows that once you've captured a waveform everything is possible!
Matching 12 waveforms ... Woo hoo! Good luck
.. Let me know if I can help with the software - others definitely have more hardware knowledge than me
If the military can do it with 7 do you need 12? - though you can possibly discard a few that you can't match to anything.