SAE J3016 – summary of levels of driving automation.
The surge of interest in and development of autonomous vehicles continues to boost the growth of electronic devices in the automotive industry. The sensing, processing, activation, feedback and control functions done by the human brain have to be replaced with electronics. The task is proving to be exhilarating and daunting at the same time. The environment sensors, RADAR (RAdio Detection And Ranging), Camera and LIDAR (Light Detection And Ranging), are receiving a lot of attention, with increasingly greater range and resolution being demanded of the “eyes” and faster computation of the “brain”. Even though all three and more sensors (Ultrasonic / Stereo Camera / GPS / etc.) will be used together, this chapter will focus on challenges facing Camera and LIDAR. Anywhere from 2 – 8 cameras and 1 – 2 LIDAR are expected to be part of the sensor suite needed by autonomous vehicles, which have to function equally well in day and night. Near infrared (800 – 1000 nm) devices are currently the emitters of choice in these sensors. Higher range, resolution and field of view pose many challenges that must be overcome with new electronic device innovations before we realize the safety and other benefits of autonomous vehicles.
- autonomous vehicles
The Federal Automated Vehicles Policy document released by NHTSA in September 2016 states that 35,092 people died on US roadways in 2015 and that 94% of the crashes were attributed to human error. Highly automated vehicles (HAVs) have the potential to mitigate most of these crashes. They also have other advantages: they are not emotional, do not fatigue like humans, can learn from their own past mistakes and those of other HAVs, and can use complementary technologies like Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I), which could further enhance system performance. Add in the potential to save energy and reduce pollution (better fuel economy, ride sharing and electrification), and there is a huge impetus to implement autonomous vehicle technology as soon as possible.
On the other hand we have the consumer industry from Silicon Valley eyeing autonomous vehicles as a huge platform to engage, interact, customize and monetize the user experience. Think online shopping, watching a movie, doing your email or office work, video chats, customized advertisements based on user profile and location, etc. – all while our transport takes us to our destination. The innovation and business potential presented by the HAVs is only limited by imagination and savvy to overcome the challenges.
Among the various challenges to overcome are those of sensing the environment around and even inside the vehicle. Two of these sensing technologies are LIDAR and camera. Both are evolving fast to meet the industry demands. Levels 3–5 of autonomous vehicles as defined by NHTSA and SAE (Table 1) will need a high resolution, long range scanning LIDAR. They will also need cameras which operate in the infrared (and visible) spectrum to be able to function at night and in low light conditions.
|Level||Name||Narrative definition||DDT: sustained lateral and longitudinal vehicle motion control||DDT: Object and Event Detection and Response (OEDR)||DDT fallback||Operational Design Domain (ODD)|
|Driver performs part or all of the Dynamic Driving Task (DDT)|
|0||No Driving Automation||The performance by the driver of the entire DDT, even when enhanced by active safety systems.||Driver||Driver||Driver||n/a|
|1||Driver Assistance||The sustained and ODD-specific execution by a driving automation system of either the lateral or the longitudinal vehicle motion control subtask of the DDT (but not both simultaneously) with the expectation that the driver performs the remainder of the DDT.||Driver and System||Driver||Driver||Limited|
|2||Partial Driving Automation||The sustained and ODD-specific execution by a driving automation system of both the lateral and longitudinal vehicle motion control subtasks of the DDT with the expectation that the driver completes the OEDR subtask and supervises the driving automation system.||System||Driver||Driver||Limited|
|Automated Driving System (“System”) performs the entire DDT (while engaged)|
|3||Conditional Driving Automation||The sustained and ODD-specific performance by an ADS of the entire DDT with the expectation that the DDT fallback-ready user is receptive to ADS-issued requests to intervene, as well as to DDT performance-relevant system failures in other vehicle systems, and will respond appropriately.||System||System||Fallback ready user (Driver is fallback)||Limited|
|4||High Driving Automation||The sustained and ODD-specific performance by an ADS of the entire DDT and DDT fallback without any expectation that a user will respond to a request to intervene.||System||System||System||Limited|
|5||Full Driving Automation||The sustained and unconditional (i.e., not ODD-specific) performance by an ADS of the entire DDT and DDT fallback without any expectation that a user will respond to a request to intervene.||System||System||System||Unlimited|
We will start by discussing the infrared spectrum, its advantages and disadvantages, and then move on to LIDAR and Camera in some level of detail.
2. Infrared spectrum
2.1. Infrared radiation
The sun radiates electromagnetic energy in a wide spectrum from the shortest X-rays to radio waves. Figure 1 shows the portion visible to the human eye (~380–750 nm) and the infrared region. The near infrared region (~750–1400 nm) is used in many sensing applications including the night vision camera and LIDAR. Active night vision cameras (which use light from artificial sources) are different from passive thermal imaging cameras, which operate at longer wavelengths (8–15 μm) and use natural heat as the source of radiation. Figure 1 also shows the wide range of infrared radiation from 750 nm to 1 mm wavelength.
Figure 2 shows the human eye and camera sensitivity to the visible and near infrared (NIR) spectrum. The advantage and disadvantage for sensing applications primarily arise from the fact that infrared is mostly invisible in the far field. A fair amount of red color can be seen by most humans up to 850 nm; beyond that, perception is fairly subjective. The fact that the human eye is not very sensitive to NIR light allows cameras to be used unobtrusively (especially at night or in poor lighting conditions). The disadvantage lies in the fact that silicon based image sensors have poor sensitivity at these wavelengths (~35% QE at 850 nm and 10% at 940 nm). In addition, these wavelengths can reach the retina of the eye, so the exposure has to be controlled to avoid damage.
2.3. Spectral irradiance
Dips in the spectral irradiance at the surface are primarily due to water in the atmosphere. In the infrared spectrum of interest they occur at 810, 935, 1130, 1380, 1880 nm and beyond. This means the ambient noise is lower at these specific wavelengths. However, the wavelengths of many semiconductor devices shift with temperature (~0.3 nm/°C for the gallium arsenide and aluminum gallium arsenide materials used in the infrared spectrum); for automotive applications this shift is ~44 nm over the range from −40 to 105°C. Ideally we need a peak with flat ambient noise variation around it for good design.
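The temperature drift quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the 0.3 nm/°C coefficient and the −40 to 105°C range come from the text, while the 940 nm nominal peak at 25°C and the function names are assumptions chosen for the example.

```python
# Illustrative only: linear wavelength drift of an infrared emitter with temperature.
DRIFT_NM_PER_C = 0.3  # coefficient quoted in the text for GaAs/AlGaAs emitters


def wavelength_shift_nm(t_min_c: float, t_max_c: float,
                        coeff_nm_per_c: float = DRIFT_NM_PER_C) -> float:
    """Total peak-wavelength shift across a temperature range."""
    return coeff_nm_per_c * (t_max_c - t_min_c)


def peak_wavelength_nm(temp_c: float, nominal_nm: float = 940.0,
                       nominal_temp_c: float = 25.0,
                       coeff_nm_per_c: float = DRIFT_NM_PER_C) -> float:
    """Peak wavelength at temp_c; the nominal values are assumed, not from the text."""
    return nominal_nm + coeff_nm_per_c * (temp_c - nominal_temp_c)


shift = wavelength_shift_nm(-40.0, 105.0)
cold_peak = peak_wavelength_nm(-40.0)
hot_peak = peak_wavelength_nm(105.0)
print(f"shift over automotive range: {shift:.1f} nm")
print(f"assumed 940 nm emitter spans {cold_peak:.1f} to {hot_peak:.1f} nm")
```

Under these assumptions the emitter peak sweeps across the 935 nm water absorption dip as the temperature changes, which is why the ambient noise seen by the sensor would not stay constant.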
Another observation from Figure 3 is the lower ambient noise as we go to the longer wavelengths. However, past ~1000 nm the material base for detectors changes from silicon to germanium or indium gallium arsenide – which can be expensive.
3. Light Detection And Ranging (LIDAR)
3.1. Need for LIDAR in automotive
LIDAR, RAdio Detection And Ranging (RADAR) and Camera are the environment sensors central to autonomous car operation. They are used to detect and classify the objects around the car by location and velocity. Each of the sensors has limitations, and the information obtained from them is fused together with confidence prior to making a decision on the vehicle's trajectory.
Table 2 provides a brief summary of the above sensing technologies.
|Sensor||Typical range||Horizontal FOV||Vertical FOV||2020 price range||Comments|
|24 GHz RADAR||60 m||56°||~±20°||<$100||USA bandwidth max 250 MHz; robust to snow/rain; poor angular resolution; sensitive to installation tolerances and materials|
|77 GHz RADAR||200 m||18°||~±5°||<$100||Similar to 24 GHz RADAR with more bandwidth (600 MHz); sensitive to installation tolerances and materials|
|Front Mono Camera||50 m||36°||~±14°||<$100||Versatile sensor with high resolution; poor depth perception; high processing needs; low range; sensitive to dirt/obstruction|
|LIDAR (Flash)||75 m||140°||~±5°||<$100||Better resolution than RADAR and more range than Camera. Eye safety limits; Poor in bad weather; sensitive to dirt/obstruction|
|LIDAR (Scanning)||200 m||360°||~±14°||<$500||Similar to Flash LIDAR with higher resolution and Cost; sensitive to dirt/obstruction|
LIDAR sensors can be classified by any of their key parameters:
Operating principle: Time of Flight (ToF)/Frequency Modulated Continuous Wave (FMCW)
Scanning technology: Mechanical/Micro-Electro-Mechanical Systems (MEMS) mirror/Optical Phased Array (OPA)
Wavelength: 905/1550 nm
Detection technology: Photodiode/Avalanche Photodiode/Single Photon Multiplier
…and many other ways
3.3. Time of Flight LIDAR Operating Principle
The Time of Flight LIDAR operation can be explained using Figure 4.
A laser is used to illuminate or “FLASH” the field of view to be sensed. The laser pulse travels until it is reflected off a target and returned to a detector. The time taken for the pulse to travel back and forth provides the range. The location of the target is determined from the optics, which map the field of view onto the detector array. Two or more pulses from the target provide the velocity. The angular resolution depends on the number of detector pixels which map the field of view. The more pixels we have, the better the resolution.
The same principle is used by 3D cameras or high resolution flash LIDAR. Higher power and more detector pixels are used.
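The Time of Flight relationships described above can be sketched in a few lines. This is a minimal illustration rather than a sensor implementation: the function names and the sample numbers are assumptions, and the velocity estimate is simply the change in range between two pulses divided by the time between them.

```python
# Illustrative Time of Flight relationships; names and sample values are assumptions.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def tof_range_m(round_trip_s: float) -> float:
    """Range from the round-trip time of a pulse (out and back, hence the /2)."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def round_trip_time_s(range_m: float) -> float:
    """Round-trip time the detector must resolve for a target at range_m."""
    return 2.0 * range_m / SPEED_OF_LIGHT_M_S


def radial_velocity_m_s(range1_m: float, range2_m: float, dt_s: float) -> float:
    """Radial velocity from two range measurements dt_s apart (negative = approaching)."""
    return (range2_m - range1_m) / dt_s


def angular_resolution_deg(fov_deg: float, n_pixels: int) -> float:
    """Angle covered by one detector pixel when n_pixels map the field of view."""
    return fov_deg / n_pixels


r = tof_range_m(1.0e-6)                       # a 1 microsecond round trip
t200 = round_trip_time_s(200.0)               # time budget for a 200 m target
v = radial_velocity_m_s(100.0, 99.0, 0.05)    # 1 m closer after 50 ms
res = angular_resolution_deg(140.0, 1400)     # hypothetical 1400-pixel row over 140 degrees
print(f"range {r:.1f} m, round trip for 200 m {t200 * 1e6:.2f} us, "
      f"velocity {v:.1f} m/s, resolution {res:.2f} deg")
```

A target at 200 m returns its pulse in a little over a microsecond, which gives a feel for the timing precision the detector electronics need.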
3.4. Emitter and detector options
As shown in Figure 4, doubling the range requires four times the power. As we increase the power, we start running into eye safety limits. Infrared light below 1400 nm can reach the retina of the eye. If the exposure limit is exceeded, permanent eye damage can occur.
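The 2× range for 4× power relationship can be generalized as a square law, which is an assumption made here for illustration (the text only states the 2× case). The sketch below applies it to the 75 m and 200 m typical ranges listed in Table 2; the function name is hypothetical.

```python
# Illustrative square-law scaling of emitter power with range.
# Assumption: needed power grows with the square of the range, consistent
# with the "2x range needs 4x power" statement in the text.
def relative_power(range_ratio: float) -> float:
    """Factor by which emitter power must grow for a given range increase."""
    return range_ratio ** 2


double_range = relative_power(2.0)
flash_to_scanning = relative_power(200.0 / 75.0)  # 75 m -> 200 m, ranges from Table 2
print(f"2x range: {double_range:.0f}x power; "
      f"75 m to 200 m: {flash_to_scanning:.1f}x power")
```

Under this assumption, stretching a 75 m design to 200 m with no other changes would need roughly seven times the emitted power, which is why the other levers (detectors, lenses, pulse width) matter.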
There are many levers available to achieve the needed range, including better detectors, bigger lenses and shorter pulse widths. Of course, the best option would be to use light above the 1400 nm wavelength. However, to use lasers and detectors in this wavelength region (>1400 nm), we typically have to use more expensive materials (indium gallium arsenide phosphide lasers and germanium-based detectors).
3.5. Eye safety
Sunlight on the earth’s surface is composed of ~52% infrared (>700 nm), ~43% visible (400–700 nm) and ~3% ultraviolet (<400 nm). The intensity of infrared is low enough that it does not cause eye damage under normal exposure. When light is visible and bright, the eye has a natural blink response and we do not stare at it, which helps to avoid eye damage. Infrared light is not visible and so can cause eye damage if exposure limits are not regulated.
Safe levels of infrared radiation are regulated by IEC-62471 for Light Emitting Diodes and IEC-60825 (2014) for lasers. In the USA, the equivalent federal standards are in 21 CFR 1040 (Code of Federal Regulations).
The standards have hazard exposure limits for the cornea of the eye, a thermal hazard limit for skin and a retinal thermal hazard exposure limit for the eye. For exposures above 1000 s, the irradiance limit is 100 W/m² at room temperature and 400 W/m² at 0°C. The retina exposure limits tend to be more stringent. The calculations are complex and depend on wavelength, size of the emitter, exposure time and other factors.
3.6. Signal processing challenges
As sensors demand higher resolution and faster response, the computational needs increase. At the raw signal level, using the forward camera as an example:
Number of pixels to be processed per second = frames per second × (horizontal field of view/resolution) × (vertical field of view/resolution).
Example: 30 fps camera, 40° HFOV, 40° VFOV, 0.1° resolution
30 × 400 × 400 = 4.8 Mpx/s
A similar amount of data needs to be processed by the LIDAR, RADAR and other sensors. At some level, this information has to be fused to recognize and classify objects and their trajectory.
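The raw pixel rate from the camera example above can be reproduced with a short calculation. This is a sketch of the stated formula only; the function name is an assumption, and rounding the pixel counts to whole numbers is a choice made here to avoid floating-point noise.

```python
# Illustrative raw pixel-rate estimate for a camera, following the formula in the text.
def pixel_rate_per_s(fps: float, hfov_deg: float, vfov_deg: float,
                     resolution_deg: float) -> int:
    """Pixels per second = fps x (HFOV / resolution) x (VFOV / resolution)."""
    columns = round(hfov_deg / resolution_deg)
    rows = round(vfov_deg / resolution_deg)
    return int(fps * columns * rows)


rate = pixel_rate_per_s(fps=30, hfov_deg=40.0, vfov_deg=40.0, resolution_deg=0.1)
print(f"{rate / 1e6:.1f} Mpx/s")
```

Because the rate scales with the inverse square of the angular resolution, halving the resolution step to 0.05° would quadruple the data to be processed.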
As more and more sensing data is collected, processed and acted upon in real time (the time between collection and use is extremely short), new ways of storing, processing and updating data are being developed. For example, the 3-dimensional roadway maps needed for autonomous driving are stored in the cloud (remote server) and real time data is processed to look only for changes and updates, thus reducing the amount of data crunching to be done in real time. Another trend is to collect and process the raw analog signal when possible, thus reducing the downstream processing needs.
Security of data in autonomous vehicles is another growing concern and business opportunity for innovation. Automotive Information Sharing and Analysis Center (Auto-ISAC) (
The number of cameras in automobiles continues to grow as their functional versatility is exploited with increasing innovation. They have become central to Advanced Driver Assistance Systems (ADAS) like adaptive cruise control, adaptive high beam, automatic emergency braking, lane departure warning, blind spot detection, driver monitoring, traffic sign detection and others.
The latest Tesla Model 3 is believed to have up to eight exterior cameras. Other OEMs are also using interior driver monitoring and gesture recognition cameras. A presentation from IHS Markit shows typically five exterior cameras and one interior camera for Level 3, and eight exterior cameras and one interior camera for Level 4, being planned by a number of Original Equipment Manufacturers.
4.1. Exterior infrared camera (night vision)
Cameras need light to illuminate the objects in their field of view. Currently most cameras used in ADAS functions work with visible light, which is fine for daytime operation. However, at night the prime source of visible light is usually the headlamps of the car. The visible light from the headlamps is strictly regulated by NHTSA with its Federal Motor Vehicle Safety Standard 108 (FMVSS 108). Figure 5 below shows a bird’s eye view of the permitted illumination region in the USA.
It can be observed that in essence, visible light can only be legally used for a limited range of ~60 m in front of the vehicle. Illumination outside the car lane and around the car is very limited (if any). These legal requirements are not expected to be changed anytime soon – since we will have cars driven by humans for at least another 20–30 years. This means to illuminate to longer and wider fields of view, the cameras have to work with infrared light (which is not regulated by FMVSS 108). As long as the infrared light is within eye safe limits, it can be used all around the car.
Figure 6 shows a graphic overview of the regions around the car that are covered by cameras. The forward camera needs to ideally sense as far as the RADAR and LIDAR to permit good sensor fusion.
The target range for RADAR and LIDAR is at least 200 m (Forward direction) and 50–100 m in all other directions.
4.2. Exterior camera illumination challenges
The spectral sensitivity of CMOS image sensors at 850 nm is ~35% compared to its peak at 550 nm (green). Further down at 940 nm, this reduces to ~10%. This means a larger number of infrared photons is needed to generate a clear image.
To illuminate targets at longer ranges and wider field of view more light is needed. In addition, different targets have different reflectivity – which can have a significant effect on the image quality. So while we put out more and more light to get a better signal – we need to ensure the intensity is still eye safe. We also start eating up more energy from the battery for illumination. Calculations show the amount of infrared flux needed could be anywhere from 6 W (100 m range, 12° FOV, 50% reflectivity, 850 nm, 0.15 μW/cm2, Lens F#1) to 1250 W (200 m range, 40° FOV, 10% reflectivity, 850 nm, 0.15 μW/cm2, Lens F#1) [10, 11].
A typical headlamp today may use 5 W of visible light per lamp. Imagine the complexity of adding hundreds more watts to the headlamp. The self-driving ecosystem has not yet come to grips with the scope of the challenge that it has to deal with here. The alternative would be to rely more on the LIDAR and RADAR sensors at the longer ranges and use the camera only at short ranges. This option may not provide the needed reliability, since all of these technologies have weaknesses (RADAR does not have the same resolution as a camera at long ranges and LIDAR is more prone to poor performance in bad weather).
Potential solutions which have not been fully vetted include using pulsed infrared lasers to illuminate the scene for CMOS based cameras, and using infrared matrix lighting architectures where rows of LEDs are turned on in sequence with a rolling shutter camera. More options are expected as progress is made.
4.3. Interior camera – market need
The need for an interior camera arises out of multiple market forces. The first is the introduction of self-driving cars which are autonomous only in certain driving conditions (highways/traffic jams). The cars switch between the human driver and the computer as needed. To do this effectively, the human driver has to be monitored as part of the environment in and around the car. This is to ensure adequate warning is given to the driver to leave their current engagement and get ready to take over the task of driving.
The second market force is the increase in distracted driving. In 2014, 3179 people (10% of total) were killed and an additional 431,000 (18% of total) were injured in collisions involving distracted drivers in the USA. NHTSA has a blueprint to reduce accidents related to distracted driving, which encourages OEMs to put in place measures to ensure the driver keeps their eyes on the road when the vehicle is moving. A definition of distraction in terms of driver gaze and time elapsed away from looking straight is provided in other related NHTSA documents. At a high level, looking for more than 2 s in a direction 30° sideways or up-down when the vehicle speed is more than 5 mph would be classified as distracted. The increase in distracted driving is attributed to cell phone/smartphone/texting and related activities.
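The high-level distraction criterion above (more than 2 s, more than 30° off straight ahead, above 5 mph) can be sketched as a simple check over a stream of gaze samples. This is only an illustration of that rule, not NHTSA's definition or any production algorithm: the `GazeSample` structure, the single combined off-axis angle and the function name are all assumptions.

```python
# Illustrative distraction check based on the thresholds quoted in the text.
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class GazeSample:
    t_s: float           # timestamp in seconds
    off_axis_deg: float  # gaze angle away from straight ahead (sideways or up-down)
    speed_mph: float     # vehicle speed at that moment


def is_distracted(samples: Iterable[GazeSample],
                  max_angle_deg: float = 30.0,
                  max_time_s: float = 2.0,
                  min_speed_mph: float = 5.0) -> bool:
    """True if gaze stays beyond max_angle_deg for longer than max_time_s while moving."""
    away_since: Optional[float] = None
    for s in samples:
        looking_away = (abs(s.off_axis_deg) > max_angle_deg
                        and s.speed_mph > min_speed_mph)
        if not looking_away:
            away_since = None          # gaze returned to the road: reset the timer
            continue
        if away_since is None:
            away_since = s.t_s         # start of a new look-away episode
        if s.t_s - away_since > max_time_s:
            return True
    return False


# A 30 fps camera would give denser samples; 0.5 s steps keep the example short.
long_glance = [GazeSample(0.5 * i, 35.0, 40.0) for i in range(6)]
short_glances = [GazeSample(0.5 * i, 35.0 if i % 3 else 0.0, 40.0) for i in range(6)]
print(is_distracted(long_glance), is_distracted(short_glances))
```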
Additional benefits and applications continue to emerge from the driver monitoring infrared camera system. It lends itself well to catching drowsy drivers (eyelids shut or drowsy pupils) and to face recognition, which is not strong enough to be a biometric device but is enough to at least enable customized settings for different drivers in a family. Many more applications are expected to come.
The auto industry is responding to these two needs (autonomous cars, distracted driving) by installing an infrared camera to monitor the gaze of the driver. Infrared illumination is needed, since we do not want to distract the driver at night with visible light. The wavelength for illumination is in the 850–950 nm range. The eye safety and camera sensitivity challenges of illumination in this spectrum were briefly discussed in earlier sections. A few other challenges are discussed in the next section.
4.4. Interior camera illumination challenges
When we use an infrared camera facing the driver, the LEDs are shining the light right on our eyes and face. Light at 850 nm can be red enough to be seen easily by most people, especially at night. Measures to put in a dark filter and smudge the bright red LED spot with optics are partially successful. The problem arises from the fact that anything done to reduce the brightness will usually also reduce the illumination, which would result in poor image quality and failure of the image processing software to detect distraction in gaze.
One solution is to go to higher wavelengths (940 nm) – the challenge here is lower camera sensitivity. This has been overcome by pulsing higher peak currents at lower duty cycle using a global shutter image sensor. The typical cameras used are 30 fps and these are fast enough – since gaze while driving does not change that often and fast.
On the eye safety side, measures are needed to ensure that when the eyes are too close to the Infrared LEDs (IREDs), the IREDs are either shut off or reduced in intensity. Typically the distance to the eye is estimated with the camera itself; as an added measure we can have proximity sensors.
Since these cameras work in infrared with a filter that blocks visible wavelengths, the biggest challenge for illumination tends to be during daytime under full sunlight. The IREDs typically have to overcome ambient noise from the sun. Polarized sunglasses can also sometimes prevent function if the coating prevents the wavelength from passing through.
The last challenge worth mentioning is that of consumer acceptance and loss of privacy. From a legal perspective, if the camera is recording the driver’s face, the information can be pulled up in court if needed by a lawyer. NHTSA regulations mandate that any information needed for vehicle safety has to be stored for a short duration, essentially a black box (as used in aircraft) to help reconstruct an accident. Whether consumers will trade a loss of privacy for safety and convenience (of automated driving) remains to be seen. OEMs may initially provide consumers with the option to turn off the camera (with the related loss of function) to enable the transition.
4.5. Additional applications for interior camera
OEMs are evaluating the concept of using interior cameras to monitor all occupants in the car – to enable optimum deployment of airbags and other passive safety devices. At a basic level, if there is no occupant in the passenger seat (or just a cargo box) – do not deploy the airbag.
Another application is the use of gesture recognition. The idea is to use gestures seamlessly and conveniently to open windows/sunroofs/turn on the radio/change albums/etc. The successful combination of voice, touch and gesture to operate devices depends a lot on the age group (and resultant car design) and how well the technologies are implemented.
Face recognition and iris recognition are already making their way into smartphones. They are expected to penetrate the auto market. Even though the technologies are available and mature, the business case/consumer demand/willingness to pay for these functions is yet to be explored.
4.6. Signal processing
As cameras become ubiquitous around the car, the questions become how many cameras are enough and what the range and resolution of the cameras should be. The same questions can be asked of LIDAR and RADAR. However, signal processing tends to be more demanding for cameras because of their comparatively high resolution.
Assuming a VGA format for the image sensor, we get 480 (H) × 640 (W) pixels per frame; with typically 30 fps coming in for processing. The resolution we get from this VGA image sensor depends on the optical field of view it covers and the maximum range at which the smallest object has to be recognized and resolved for action. At 100 m and a 40° HFOV the width covered by the 640 pixels is ~7279 cm. This means each pixel covers 11.4 cm or ~4.5 in. Is this level of resolution good enough for self-driving cars? The next section digs a little deeper into this topic.
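The per-pixel footprint quoted above follows from simple trigonometry, and a short sketch makes it easy to try other ranges or sensors. The function names are assumptions; the geometry treats the field of view as a flat plane at the given range.

```python
# Illustrative pixel-footprint calculation for a camera at a given range.
import math


def scene_width_m(range_m: float, hfov_deg: float) -> float:
    """Width of the scene covered by the horizontal field of view at range_m."""
    return 2.0 * range_m * math.tan(math.radians(hfov_deg / 2.0))


def pixel_footprint_cm(range_m: float, hfov_deg: float, n_pixels: int) -> float:
    """Scene width covered by a single pixel, in centimetres."""
    return 100.0 * scene_width_m(range_m, hfov_deg) / n_pixels


width = scene_width_m(100.0, 40.0)
footprint = pixel_footprint_cm(100.0, 40.0, 640)
print(f"scene width {width:.2f} m, footprint {footprint:.1f} cm per pixel")
```

The footprint grows linearly with range, so at 200 m the same VGA sensor and optics would cover roughly twice as much width per pixel.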
4.7. Exterior camera resolution requirement
What is the smallest object that can change the trajectory of the car? One could argue this could be as small as a nail or sharp object on the road. Maybe with the newer tires which can roll over nails, we can overlook this object (They would then become mandatory for self-driving cars). The next object I can think of would be a solid brick placed on the road which even though small, could change the trajectory of the car. Other such objects like tires, tin cans, potholes, etc. could be imagined that would have a similar impact.
The autonomous car's machine vision has to detect such an object at a far enough distance to take appropriate measures (steer, brake/slow down or prepare for collision). With a speed of 100 mph and a dry road with a friction coefficient of 0.7, a braking/sensing range of 190 m is calculated. A modular USA brick with dimensions of 194 × 92 × 57 mm would subtend an angle of ~2 arc min (tan⁻¹(65/100,000)). This level of resolution would be outside the capability of a standard VGA camera.
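The numbers in this paragraph can be explored with a small calculation. The sketch below is an illustration under stated assumptions, not the derivation behind the quoted 190 m: it uses the constant-friction braking formula v²/(2μg) and adds a 1 s reaction time, which is an assumption that happens to give a similar figure. The per-pixel angle assumes the 640-pixel, 40° HFOV VGA camera from the previous section.

```python
# Illustrative stopping-distance and angular-size estimates; all names are assumptions.
import math

MPH_TO_M_S = 0.44704
G_M_S2 = 9.81


def braking_distance_m(speed_mph: float, friction: float) -> float:
    """Distance to brake to a stop at constant deceleration friction * g."""
    v = speed_mph * MPH_TO_M_S
    return v * v / (2.0 * friction * G_M_S2)


def stopping_distance_m(speed_mph: float, friction: float,
                        reaction_s: float) -> float:
    """Braking distance plus distance travelled during an assumed reaction time."""
    return speed_mph * MPH_TO_M_S * reaction_s + braking_distance_m(speed_mph, friction)


def subtended_arcmin(size_mm: float, range_m: float) -> float:
    """Angle subtended by an object of size_mm at range_m, in arc minutes."""
    return math.degrees(math.atan(size_mm / (range_m * 1000.0))) * 60.0


def pixel_angle_arcmin(hfov_deg: float, n_pixels: int) -> float:
    """Angle covered by one pixel, in arc minutes."""
    return hfov_deg / n_pixels * 60.0


braking = braking_distance_m(100.0, 0.7)
stopping = stopping_distance_m(100.0, 0.7, reaction_s=1.0)  # 1 s is an assumption
brick = subtended_arcmin(65.0, 100.0)       # values used in the text's tan^-1 term
vga_pixel = pixel_angle_arcmin(40.0, 640)   # VGA camera from the previous section
print(f"braking {braking:.0f} m, stopping {stopping:.0f} m, "
      f"brick {brick:.2f} arcmin, VGA pixel {vga_pixel:.2f} arcmin")
```

With these inputs the brick subtends less than one VGA pixel, which is consistent with the conclusion that a standard VGA camera cannot resolve it.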
After detection, the object has to be classified before an action can be taken on how to deal with it. The kinds of objects the car could come across on its path depend very much on the geo-fenced location. Objects on US freeways and urban streets could be very different from those in India or China. This is the point where the capability of the human senses and brain starts to daunt current computer chips.
5. Sensor fusion
5.1. Need for sensor fusion
For self-driving cars to be accepted by society, they would have to demonstrate a significantly lower probability of collision when compared to human drivers. A 2016 study by Virginia Tech Transportation Institute found that self-driving cars would be comparable to or a little better than humans for severe crashes, but significantly better at avoiding low severity crashes (level 3). The level 3 crash rate was calculated at 14.4 crashes per million miles driven for humans and 5.6 crashes per million miles for self-driving cars.
To keep things in perspective, we could estimate an average person in USA to drive 900,000 miles in their lifetime (12,000 miles/year × 75 years). Also note that the above report uses only Google self-driving car data. These cars are known to have a full suite of sensors (Multiple LIDAR, RADAR, Cameras, Ultrasonic, GPS and other sensors).
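To put the crash rates and the lifetime mileage together, the expected number of low severity (level 3) crashes over a driving lifetime can be estimated as rate times miles. This is simple arithmetic on the figures quoted above and assumes the rate stays constant over the whole lifetime; the function name is hypothetical.

```python
# Illustrative lifetime crash estimate from the rates quoted in the text.
def expected_crashes(miles: float, crashes_per_million_miles: float) -> float:
    """Expected crash count assuming a constant rate per million miles."""
    return miles * crashes_per_million_miles / 1_000_000.0


lifetime_miles = 12_000 * 75                     # miles per year x years, from the text
human = expected_crashes(lifetime_miles, 14.4)   # human level 3 crash rate
automated = expected_crashes(lifetime_miles, 5.6)
print(f"lifetime miles {lifetime_miles}, human {human:.1f}, self-driving {automated:.1f}")
```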
The point is that just like the human driver, the car has to integrate the information from multiple sensors and make the best decision possible in the circumstance. On top of that, it has to be way better to get people to start adopting the technology. Knowing that each of the sensor technologies has some limitation, the need to fuse multiple inputs reliably is a daunting task. Incorrect or poor implementation of the sensor fusion could quickly take the car back to the dealer show room.
5.2. Challenges to sensor fusion
Figure 7 below illustrates the challenge of sensor fusion.
The objective of sensor fusion is to determine the environment around the vehicle trajectory with enough resolution, confidence and low enough latency to navigate the vehicle safely.
Figure 7 row 1 shows the ideal case when two sensors agree on an object and the object is detected early enough to navigate the car.
Figure 7 row 2 shows a case where each of the sensors classifies the object differently. In this case, the best option may be to just agree that it is a big enough object to avoid if possible.
Figure 7 row 3 shows a similar situation where a person on a bicycle may be identified as a person or a bicycle. Again, we could agree that it is an unidentified large moving object that needs to be avoided.
The last two rows show smaller objects that pose difficult questions. Is it better to run over a small dog than to risk braking and getting rear-ended? Can the pothole be detected and classified early enough to navigate? Is the pothole or object small enough to run over?
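The fallback reasoning for rows 1 to 3 can be sketched as a tiny decision rule: if two sensors agree, keep the label; if they disagree but the object is large, treat it as an unidentified large object to avoid; otherwise leave it unresolved. This is a toy illustration only, not a real fusion architecture. The `Detection` structure, the label strings and the 0.5 m size threshold are all hypothetical.

```python
# Toy illustration of label-level fusion for two sensors; everything here is assumed.
from dataclasses import dataclass


@dataclass
class Detection:
    label: str     # class reported by one sensor, e.g. "person" or "bicycle"
    size_m: float  # estimated largest dimension of the object


def fuse_labels(a: Detection, b: Detection, large_size_m: float = 0.5) -> str:
    """Combine two classifications of the same object into one decision label."""
    if a.label == b.label:
        return a.label                      # row 1: sensors agree
    if max(a.size_m, b.size_m) >= large_size_m:
        return "unidentified_large_object"  # rows 2 and 3: disagree, but avoid it
    return "unresolved_small_object"        # last rows: needs a further policy


agree = fuse_labels(Detection("car", 4.5), Detection("car", 4.4))
cyclist = fuse_labels(Detection("person", 1.7), Detection("bicycle", 1.8))
small = fuse_labels(Detection("dog", 0.4), Detection("debris", 0.3))
print(agree, cyclist, small)
```

The unresolved branch is where the open questions in the text sit: what the car should do with a small object is a policy decision that this sketch deliberately leaves out.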
These questions will take a longer time to resolve with improving technology in sensing, computing, public acceptance and legislation. The 80/20 Pareto principle would imply that the last 20% of the problems for self-driving cars will take 80% of the time it takes to bring it to mass market.
The exponential growth of electronics in the auto industry can be estimated by the number of sensors and electronic control units (ECUs) being added to each newer car: from a 2003 VW Golf (~35 ECUs, 30 sensors), to a 2013 Ford Fusion (~70 ECUs, 75 sensors), to a projection for an automated car in 2030 (~120 ECUs, >100 sensors). One could be forgiven for imagining the future car to be a supercomputer with wheels.
We are in the initial growth spurt for autonomous cars. A lot of technology still remains to be innovated and matured before regulation and standards kick in. LIDAR technology is still evolving: range, resolution, eye safety, form factor and cost are improving rapidly. Camera hardware for medium range and VGA resolution has matured, but needs improvement in range (200 m target), resolution (>8 Megapixel) and performance under poor lighting or with infrared. Sensor fusion architectures can only be optimized after the sensors needed are standardized or at least well understood. Real time operation using Artificial Intelligence and neural networks is still at an early stage. Society has yet to debate and accept the safety performance and known behavior of these robots on wheels. What a great time for electronics and the auto industry!