Previously in this space we examined the roles of Radar and LIDAR (Light Detection and Ranging) in developing self-driving vehicles. Now we will look at the third member of the Sensor Trifecta, machine vision.
Providing the ability to recognize an object (say, a pedestrian) or a pattern in a camera image involves a form of artificial intelligence called Deep Learning. The basic concept of Deep Learning is to train neural networks on hundreds or thousands of labeled examples, then use that learned experience to solve similar problems in new situations (in the case of autonomous driving, whether to turn left or right, accelerate or brake).
Similarly, via Deep Learning another algorithm can be developed to break down the image of a stop sign into its essential parts: the shape of the sign (octagon), the color (red), and the word “Stop” so as to correctly interpret the information presented. In both cases a mapping between features and actions is established during training.
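The feature-to-action mapping described above can be illustrated with a deliberately tiny sketch. This is not Mobileye's system or a real vision pipeline; it is a toy single-neuron classifier, with hand-coded stop-sign features (shape, color, text) standing in for what a deep network would extract from pixels:

```python
# Toy sketch (not a production system): a single perceptron that learns a
# mapping from hand-coded image features to a "stop sign" / "not" label.
# Feature vector: [is_octagon, is_red, has_stop_text], each 0 or 1.

def train_perceptron(examples, epochs=20, lr=0.1):
    """Learn weights and bias from labeled (features, label) pairs."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Labeled training examples: a stop sign vs. other red or octagonal objects.
training = [
    ([1, 1, 1], 1),  # octagon, red, "STOP" text -> stop sign
    ([1, 1, 0], 0),  # red octagon, no text      -> not confirmed
    ([0, 1, 0], 0),  # red, wrong shape          -> e.g. a tail light
    ([1, 0, 0], 0),  # octagon, not red          -> not a stop sign
]
w, b = train_perceptron(training)
print(predict(w, b, [1, 1, 1]))  # all three features present -> 1
```

A real deep network learns the features themselves from raw pixels rather than being handed them, but the principle is the same: the mapping is established from labeled examples during training, not programmed by hand.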
To give you an idea of the complexity of the Deep Learning task for Level 4 or 5 autonomous vehicles, consider the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a benchmark in object localization (across 1,000 categories) and detection (across 200 fully labeled categories) using millions of images. The Top-5 error rate (the fraction of test images for which the correct label is not among the five labels the model considers most probable) is on the order of 10%.
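The Top-5 metric is simple to compute. Here is a short sketch using toy score vectors (eight label indices instead of ILSVRC's 1,000, and made-up numbers rather than real model outputs):

```python
# Illustrative sketch of the Top-5 error metric used in ILSVRC,
# computed over toy scores rather than real model outputs.

def top5_error(score_rows, true_labels):
    """Fraction of images whose true label is NOT among the
    five highest-scoring labels predicted by the model."""
    misses = 0
    for scores, truth in zip(score_rows, true_labels):
        # Rank label indices by descending score, keep the top five.
        top5 = sorted(range(len(scores)), key=lambda i: scores[i],
                      reverse=True)[:5]
        if truth not in top5:
            misses += 1
    return misses / len(true_labels)

# Two toy "images", each scored over 8 label indices.
scores = [
    [0.05, 0.30, 0.20, 0.10, 0.15, 0.08, 0.07, 0.05],  # truth 2: in top 5
    [0.40, 0.25, 0.15, 0.10, 0.05, 0.02, 0.02, 0.01],  # truth 7: not
]
print(top5_error(scores, [2, 7]))  # -> 0.5
```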
So, given the difficulty of correctly interpreting camera images, you might now be wondering which sensor technology among Radar, LIDAR, or machine vision will win out. The answer is all three. Use of all three sensor technologies is needed to give the autonomous car the redundancy to: 1) develop an accurate map of its immediate surroundings; 2) provide the necessary situational awareness without error; and 3) determine a safe, drivable path, maneuver in traffic, and avoid collisions.
Take a second to let that settle in. Okay, let’s move on.
Adding to the difficulty of coming up with a successful Deep Learning algorithm for self-driving cars is the real-world fact that driving with and without other cars on the road are two totally different problems; the former has to account for unpredictable motion. Throw in the need to develop “policy,” i.e., what to do at a four-way stop, or when turning right on red, and you can quickly understand how this complicates an already challenging machine learning task. Instinct suggests a lot of work remains to be done, and indeed policy may be the Achilles’ heel of machine vision algorithms because of the unpredictable way human drivers behave.
Is there another way? Perhaps. Mobileye, a company based in Israel that develops cameras, hardware, and software for the auto industry, thinks so. Recently acquired by Intel for $15.3 billion, Mobileye is pinning its hopes on a system that employs eight cameras spaced around the vehicle, along with processing chips. Key to the Mobileye effort, starting in 2018, is crowdsourcing data to produce high-definition maps, as well as the driving-policy intelligence underlying driving decisions.
Current 3D mapping is done via laser scanning with a fleet of cars traveling U.S. roads, but that is a very time- and manpower-intensive operation. And the U.S. road network has 4 million miles of road, so even with a large fleet of cars this could take quite a while. Mobileye calls its answer Road Experience Management (REM), which uses crowd-sourced, real-time data for precise localization and high-definition lane data to support fully autonomous driving.
Whereas a normal digital map shows roads, intersections, and geophysical landmarks, these maps would have a precision measured in centimeters.
Here’s how: with millions of front-facing cameras about to be installed in vehicles, the idea is to use these cameras, with some built-in AI, to locate key road markers (lanes, signposts, and other objects), send the results up to the cloud, analyze them there, and send back updated maps of the road just traveled.
According to Mobileye, it takes just nine vehicles traveling a given road to obtain the high-definition map accuracy needed (10 cm). Mobileye claims the data transmission rates to the cloud will not be as high as you might imagine: 10 kB per km of road, or 1 MB per 100 km. The camera’s AI will parse out unimportant data, keeping the transmission rate manageable.
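A quick back-of-the-envelope check shows how modest that upload rate is (the 10 kB/km figure comes from the article; the distance conversions are our own arithmetic):

```python
# Sanity check of Mobileye's quoted REM upload rate: 10 kB of map data
# per km of road driven. Distances below are illustrative assumptions.

KB_PER_KM = 10

def upload_mb(distance_km):
    """Map-data upload in megabytes for a given distance driven."""
    return KB_PER_KM * distance_km / 1000

print(upload_mb(100))     # 100 km -> 1.0 MB, matching the article's figure
print(upload_mb(20_000))  # a rough year of driving -> 200.0 MB
```

Even a full year of typical driving would upload on the order of a few hundred megabytes, which is small next to ordinary smartphone data use.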
Transforming the imagery captured into useful data will require a huge amount of computing power. Mobileye’s current chip, the EyeQ4, will be ready for volume production in 2018 and is aimed at supporting Level 3 automation, the ability of a car to drive itself with only occasional human intervention. For fully autonomous vehicles being prepared for launch in the 2020-21 time frame, the next version of Mobileye’s computer vision chip, known as the EyeQ5, will need to perform 12-15 tera (trillion) operations per second to process the visual data while holding power consumption to 5 W or less, according to the company.
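Those two targets together imply a compute-efficiency figure worth spelling out (the TOPS and wattage numbers are the company's; the per-watt division is ours):

```python
# Efficiency implied by the EyeQ5 targets quoted above:
# 12-15 trillion operations per second within a 5 W power budget.

def tops_per_watt(tops, watts):
    """Trillion operations per second delivered per watt consumed."""
    return tops / watts

low, high = tops_per_watt(12, 5), tops_per_watt(15, 5)
print(low, high)  # 2.4 to 3.0 TOPS per watt
```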
A technology-specific standard, IEEE P2020 (Project 2020), covering vehicle camera image quality and communications protocols, is under development. The project aims to specify methods and metrics for measuring and testing automotive image quality to ensure consistency and create cross-industry reference points. A major goal is to define a standardized suite of objective and subjective test methods for measuring automotive camera image quality, communications, and comparison for OEM and Tier 1 system integrators and component vendors.
IEEE’s Standards Association Working Group on Automotive System Image Quality reports strong support from participants including Daimler, Ford, GM, Intel, Jaguar Land Rover, LG, ON Semiconductor, OmniVision, Panasonic, PSA, Robert Bosch, Samsung, Sony, Valeo, and many more.