It also failed to detect the human in its path. Which is obviously "OK", since it isn't supposed to be a self-driving car. So blame the driver, as usual.
But that horse has been beaten to death. Limiting the discussion to the technical side only, with the assumption that Tesla's target is to build a truly autonomous mode and eventually get rid of that "you are the one driving the car, don't take your hands off the wheel" disclaimer:
This is a good demonstration of how miserably camera-based systems can fail. Improving cameras and image processing reduces the number of such cases, but some edge cases, even fairly common ones, always remain. It is technically impossible to detect a color-X object against a color-X background, while for a laser rangefinder the color of the object is irrelevant, as long as it reflects enough of the beam back.
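To make that concrete, here's a toy sketch of my own (nobody's actual pipeline): with zero intensity contrast, an edge detector sees nothing, while the same scene has an unmissable step in range:

    # Toy illustration: a gray obstacle in front of a gray wall has zero
    # intensity contrast, so an intensity-edge detector sees nothing, while
    # a time-of-flight sensor still sees the range step.
    import numpy as np

    n = 100
    intensity = np.full(n, 0.5)   # object and background: identical color
    depth = np.full(n, 20.0)      # background wall at 20 m
    depth[40:60] = 5.0            # obstacle at 5 m, same color as the wall

    def edges(signal, threshold):
        # indices where adjacent samples differ by more than the threshold
        return np.flatnonzero(np.abs(np.diff(signal)) > threshold)

    print("intensity edges:", edges(intensity, 0.1))  # [] - the camera is blind here
    print("depth edges:    ", edges(depth, 1.0))      # [39 59] - ToF sees it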
You need time-of-flight sensing of some kind to assist in such trivial cases, even something fairly "cheap". Combining standard camera imaging with time-of-flight ranging brings the best of both worlds, but Tesla seems to truly believe in vision only. Their stance can be explained by arguing that human drivers are limited to vision as well, so it must be possible to replicate that; but on the other hand, even if they managed to replicate the image processing the human brain does, people cause a lot of accidents by failing to notice something, so human vision is not a good performance target!
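As a sketch of what "best of both worlds" could mean in practice (every name and threshold below is made up for illustration): fuse by union, so a direct range measurement can trigger a reaction even when the camera sees nothing:

    # Hypothetical fusion rule, just to sketch the idea. Take the *union*
    # of camera and ToF detections, so either sensor alone can trigger
    # braking.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Detection:
        distance_m: float   # measured or estimated range to the obstacle
        confidence: float   # 0..1

    def obstacle_ahead(camera: Optional[Detection],
                       tof: Optional[Detection],
                       braking_distance_m: float) -> bool:
        # A ToF return is trusted on its own (min_conf 0.0), because the
        # range is measured directly instead of inferred from appearance.
        for det, min_conf in ((camera, 0.5), (tof, 0.0)):
            if det and det.confidence >= min_conf and det.distance_m < braking_distance_m:
                return True
        return False

    # Camera misses a gray obstacle on gray asphalt; ToF still measures 12 m.
    print(obstacle_ahead(camera=None, tof=Detection(12.0, 1.0),
                         braking_distance_m=30.0))   # True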
I thought about this when I was developing a 3D time-of-flight mapping system that reliably maps the existence of all floor surfaces regardless of their material. While working on that, someone linked me Tesla PR material, supposedly showing the excellence of their 3D point clouds derived from camera data. Those point clouds looked great at a quick glance, but they did not show the streets or other flat surfaces at all, only some of the objects: edges of buildings, vehicles, street markings... Granted, what they did show was breathtaking, with good accuracy, resolution, and a lot of points. But the key to obstacle avoidance (which is the whole point of autonomous driving) is understanding what remains unseen. You can't afford to detect 99% of the objects; you need to detect 100%.
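The way I'd phrase that in code (a toy sketch of my own, not anything Tesla publishes): grid the corridor you intend to drive through and treat every cell without a single return as unknown, not as free:

    # Project returns onto a grid over the intended path and flag every
    # cell with *no* return as unknown, i.e. unverified space.
    import numpy as np

    def unknown_cells(points_xy, x_range, y_range, cell_m=0.5):
        # boolean grid: True where not a single point landed
        nx = int((x_range[1] - x_range[0]) / cell_m)
        ny = int((y_range[1] - y_range[0]) / cell_m)
        hits = np.zeros((nx, ny), dtype=bool)
        ix = ((points_xy[:, 0] - x_range[0]) / cell_m).astype(int)
        iy = ((points_xy[:, 1] - y_range[0]) / cell_m).astype(int)
        ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
        hits[ix[ok], iy[ok]] = True
        return ~hits

    # A pretty cloud of edges and markings, but no road-surface returns:
    pts = np.array([[1.2, 0.3], [4.8, -1.1], [9.5, 1.7]])
    grid = unknown_cells(pts, x_range=(0, 10), y_range=(-2, 2))
    print(f"{grid.mean():.0%} of the corridor is unverified")  # ~98%

A point cloud that wins a demo can still leave nearly the whole drivable corridor unverified, which is exactly the problem.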
And for safe autonomous maneuvering, you need to verify both that the surface to drive on is actually there (it may have a certain amount of acceptable slope) and that there are no objects above that surface. A sketch of both checks below.
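Here's a minimal sketch of those two checks; the flat-plane fit and the thresholds are assumptions of mine, not anyone's actual algorithm:

    import numpy as np

    MAX_SLOPE = np.tan(np.radians(10))   # say 10 degrees is still drivable
    MIN_CLEARANCE_M = 0.15               # returns higher than this are obstacles

    def path_is_clear(points):
        # points: (N, 3) array of x, y, z returns inside the intended corridor
        if len(points) < 3:
            return False                 # no surface seen at all: unknown, not clear
        # Check 1: fit a plane z = a*x + b*y + c to the lowest 20% of returns
        # (assumed to be the road) and reject it if it is too steep.
        low = points[points[:, 2] <= np.percentile(points[:, 2], 20)]
        A = np.c_[low[:, :2], np.ones(len(low))]
        (a, b, c), *_ = np.linalg.lstsq(A, low[:, 2], rcond=None)
        if np.hypot(a, b) > MAX_SLOPE:   # magnitude of the plane's gradient
            return False                 # surface exists but is too steep
        # Check 2: nothing sticks up above the fitted surface.
        expected_z = points[:, 0] * a + points[:, 1] * b + c
        return not np.any(points[:, 2] - expected_z > MIN_CLEARANCE_M)

Note that "no returns" fails the check rather than passing it; absence of evidence must never count as free space.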
I truly believe Tesla should be open to the idea of a hybrid sensor system, adding sensing capabilities "beyond human senses" (for example, direct distance sensing), instead of insisting they can meet and even exceed human performance using the human senses (vision) alone. Why? Because with modern electronics you can sense things humans cannot, so why not take advantage of that? The more you can sense, the better your chances of fusing all the sensor data into a complete picture, possibly with simpler algorithms. Working from a limited dataset requires really good algorithms and still misses things.
I'm sure they are working on this behind the scenes, though.