Comma.ai is not a redundant setup.
Yes, sure. But it shows just how much is possible using a single processor and a simple camera. Putting 5 of those processors in a car is cheaper and probably better than designing a crazy AI ASIC.
I'm 100% sure any "AI" self-driving system built on a single camera and a mobile-phone CPU will fail miserably to provide anything other than technological demonstrations.
There is a long, long way from a demo, or even test drives, to a reliable and safe self-driving car. The difference between good conditions and bad conditions is easily 3-4 orders of magnitude in complexity; it's a massive showstopper for most startups in this field.
In more demanding conditions, even the human brain, with 90 billion neurons and the equivalent of 1000-megapixel stereo vision, sometimes has severe difficulties working out where the road actually is. Solving this needs a combination of very sophisticated sensing and very complex processing, including "AI-like" parts such as neural networks, but also static algorithms.
I strongly suspect even Tesla, with their custom processing core and a dozen cameras, is still a long way from a reliable, fully self-driving system. They are far from having "too much" processing power; likely still the opposite! I think going with a custom ASIC for processing is exactly the right direction. There are practical limits to power consumption and parallelization: no one is going to put 5 kW worth of server PCs or graphics cards in their car and then spend time synchronizing the parallel processing. If Tesla can achieve the same processing power in a few hundred watts, they are years ahead on the processing-hardware side. This "version 1" likely isn't what actually brings us fully autonomous cars; it could be version 2 or version 3, maybe in the 2030s. But if no one else is solving the processing problem, then Tesla is likely the one who gets a fully working system first.
This said, processing hardware isn't everything. I also strongly suspect Tesla will, at some point, admit they need to throw actual distance-measurement hardware (lidar, or even radar) at the problem, even if they currently give the impression they are working only with standard 2D multi-camera setups. Stereo cameras can go a long way, but augmenting them with real distance measurements provides a lot of confidence in difficult corner cases. For Tesla, developing a low-cost solid-state lidar in co-operation with some innovative player in the photonics field is not unlikely at all, even if they don't publicly talk about it now.
For example, stereo camera vision will fail miserably to provide any sense of 3D shape on fresh, smooth snow under the uniform light of a cloudy day. I have actually witnessed a human drive directly into a 1.5-meter-deep ditch, at nearly walking speed, simply because everything in the 2D view was pure smooth white with no texture; the human brain has a hard time stereo-matching it, and I'm sure Tesla's system will have a hard time as well. OTOH, in similar conditions, a 3D time-of-flight camera built from $20 worth of components does a great job producing an accurate (within 10 cm) point cloud of the whole scene.
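To make the snow failure concrete: classic stereo block matching picks, for each pixel, the disparity whose patch match has the lowest cost. On a textured surface that minimum is sharp; on uniform white the costs are all essentially equal, so the "best" disparity is meaningless. A minimal numpy sketch (my own illustration, not from the post; the patch size, noise level, and SAD cost are arbitrary assumptions):

```python
# Sketch: why stereo block matching degenerates on textureless snow.
# Cost = sum of absolute differences (SAD) between a left patch and
# right patches at candidate disparities.
import numpy as np

rng = np.random.default_rng(0)

def sad_costs(left, right, x, y, patch=5, max_disp=16):
    """SAD matching cost at pixel (x, y) for each candidate disparity."""
    half = patch // 2
    ref = left[y - half:y + half + 1, x - half:x + half + 1]
    return np.array([
        np.abs(ref - right[y - half:y + half + 1,
                           x - d - half:x - d + half + 1]).sum()
        for d in range(max_disp)
    ])

h, w, true_disp = 64, 128, 7

# Textured scene: random pattern, right view shifted by the true disparity.
textured_left = rng.uniform(size=(h, w))
textured_right = np.roll(textured_left, -true_disp, axis=1)

# "Snow" scene: uniform white, plus independent sensor noise per camera,
# so no disparity actually matches better than any other.
snow_left = 1.0 + rng.normal(0.0, 1e-3, (h, w))
snow_right = 1.0 + rng.normal(0.0, 1e-3, (h, w))

c_tex = sad_costs(textured_left, textured_right, x=64, y=32)
c_snow = sad_costs(snow_left, snow_right, x=64, y=32)

print("textured: best disparity =", c_tex.argmin())  # recovers true_disp = 7
print("snow: cost spread =", round(float(c_snow.max() - c_snow.min()), 4))
# Snow spread is near zero: the argmin is noise, hence no usable depth.
```

The same ambiguity hits a human visual cortex and a neural network alike; active sensors (TOF, lidar) sidestep it by measuring distance directly instead of matching texture.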
Of course, even a lidar wouldn't see what's inside or below the snow... which is again why processing/AI is so important. And you can't do it with a few hundred neurons: an ant cannot drive a car, nor can your dog. We need something close to human-level intelligence here, at least in the worst cases.