Future perfect autonomous cars

Shaan
Oct 8, 2020 · 11 min read

We've all seen Marvel's Doctor Strange and its famous car crash scene. A serious lapse in Dr. Stephen Vincent Strange's judgment, wilfully distracting himself while driving, sent him crashing down the hillside. The car in the scene is a 10-cylinder, $237,250 Lamborghini Huracán LP610. Setting time travel aside, could Artificial Intelligence have foreseen this harrowing crash before it happened? Could it have detected the car speeding and drifting out of its lane, and applied the brakes? Or, better still, could it have let the doctor review his scans while the car took over the driving? In theory, yes. But…

The Lamborghini Huracán being filmed for Doctor Strange. Photographer: Allan Portilho on Behance

We are reaching the end of 2020, and it appears that our driverless cars are receding ever further into the future.

From da Vinci's self-propelled cart of around 1500, to John McCarthy's 1956 Dartmouth Conference at which the term 'Artificial Intelligence' was first adopted, to Tesla's first test drives of autonomous capability in 2015, we have come a long way in auto tech. The last decade witnessed big advances in AI, computer vision, object recognition, and speech generation, which gave us an optimistic 2020 dream of cruising in a hands-free car.

An enormous sum of USD 16 billion had been spent by 2019 by large auto tech players, including Waymo, Tesla, GM, Uber, Baidu, Ford, and Toyota, in their mobility automation race. Spending is estimated to reach $85 billion by 2023, climbing to as much as $225 billion by 2025.

Despite these extraordinary efforts and substantial expenditure, full automation in our vehicles remains a distant reality, except in special trial programs. Even Gartner, a research and advisory firm, has now placed 'autonomous vehicles' in the trough of disillusionment of its yearly Hype Cycle.

So how complex is it to build a fully driverless car? Let's walk through its technological facets.

Self-driving vehicles have turned out to be a much more difficult engineering challenge than the experts initially anticipated. They require a complex interaction of numerous digital and physical systems that can emulate human behaviors, such as reacting to weather conditions and making judgment calls on vehicle and pedestrian right of way.

In the explanation that follows, we have set aside much of the technical complexity to give a simple overview of the various elements of an autonomous vehicle.

Varying degrees of automation in our vehicles are represented as levels (L0-L5). L0 stands for no automation; L1 for Advanced Driver Assistance Systems (ADAS) controlling steering or speed to support the driver; L2 for autonomously controlling both steering and acceleration simultaneously; L3 for conditional automation, where the system drives itself but a human must remain ready to respond when the system requests it; L4 for high automation, where the system can fully drive itself under certain conditions; and L5 for full automation, with the same mobility as a human driver.
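For readers who think in code, the taxonomy can be captured as a simple lookup. This is only an illustrative sketch; the enum and helper names are made up, and the level descriptions are paraphrased from the definitions above.

```python
from enum import IntEnum

class SAELevel(IntEnum):
    """Driving-automation levels as described above (paraphrased)."""
    L0_NO_AUTOMATION = 0      # the human does all the driving
    L1_DRIVER_ASSISTANCE = 1  # ADAS controls steering OR speed
    L2_PARTIAL = 2            # system controls steering AND acceleration
    L3_CONDITIONAL = 3        # system drives; human must take over on request
    L4_HIGH = 4               # drives itself, but only under defined conditions
    L5_FULL = 5               # same mobility as a human driver

def driver_must_monitor(level: SAELevel) -> bool:
    # Up to L2 the human is still required to watch the road at all times.
    return level <= SAELevel.L2_PARTIAL

print(driver_must_monitor(SAELevel.L3_CONDITIONAL))  # False
```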

L1 and L2 automation are already available in mid-to-luxury segment cars, and their sales are reportedly soaring; Toyota, Tesla, Nissan, Ford, and BMW led in the number of such cars sold in 2019. L3 is mostly at the regulatory clearance stage, with Mercedes and BMW expected to launch cars with L3 autonomous capability by 2021. While companies are currently focusing on an L4 style of architecture with central control, L5 automation is in troubled waters due to various setbacks.

L4 & L5 automation successes are essentially dependent on the evolution of AI and sensor technology. Maturity of technologies like voice search, voice and speech recognition, motion detection, image recognition and processing, and data analysis are the building blocks for the vehicle to act independently without human intervention. These technologies enable our cars to perceive the environment, to process inputs and decide the vehicle’s path, and to act upon decisions all by itself.

Gathering 360° input data is the first step. Multiple sensors are deployed to do just that

As an autonomous vehicle operates in a dynamic environment (roads, terrains), it needs to build a map of this environment and localize itself within the map. The input to perform this Simultaneous Localisation and Mapping (SLAM) process needs to come from sensors and pre-existing maps created by AI systems and humans.

Both active and passive sensors, such as RADAR, LIDAR, thermal and digital cameras, GNSS, and ultrasound, are deployed for the task. Thermal and digital cameras use CCD (charge-coupled device) or CMOS (complementary metal-oxide-semiconductor) image sensors that capture the incoming signal, in wavelengths from the visible to the near-infrared spectrum, and convert it into an electrical signal. They are useful for detecting hot bodies such as pedestrians or animals, for gathering visual field information, and for handling peak illumination situations such as the end of a tunnel.

Active sensors such as RADAR (Radio Detection And Ranging), LIDAR (Light Detection And Ranging), and ultrasound have their own signal transmission source and rely on the principle of time-of-flight (ToF) to sense the environment. ToF measures the travel time of a signal from its source to a target by waiting for the reflection of the signal to return. Each of these sensors has its limitations, for instance environmental conditions like rain and dust, restricted signal range, and interference from other signals in the field. LIDAR is the most commonly used active sensor, as it provides a 3D map of the surroundings, detecting objects and their movement with fewer such limitations; new LIDAR sensors can enable the vehicle to see objects 150-250 m away.
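To make the time-of-flight idea concrete, the range to an object follows directly from the round-trip travel time of the pulse and the propagation speed of the signal (the speed of light for RADAR and LIDAR, roughly 343 m/s for ultrasound in air). A minimal sketch, with illustrative function and parameter names:

```python
def tof_range_m(round_trip_time_s: float, speed_m_s: float = 3.0e8) -> float:
    """Distance to a reflecting object from a time-of-flight measurement.

    The pulse travels out and back, so the one-way distance is half of
    speed * time. The default speed is the speed of light (RADAR/LIDAR);
    pass ~343.0 for ultrasound in air.
    """
    return speed_m_s * round_trip_time_s / 2.0

# A LIDAR echo arriving 1 microsecond after emission puts the object ~150 m away.
print(tof_range_m(1.0e-6))  # 150.0
```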

LiDAR provides a 3D point cloud of the environment, image source: robotcar-dataset.robots.ox.ac.uk

Vehicle manufacturers use a mixture of cameras and ToF sensors, strategically located to overcome the shortcomings of each specific technology. For example, Tesla's Model S uses a forward-mounted radar to sense the road ahead, 3 forward-facing cameras to identify road signs, lanes, and objects, and 12 ultrasonic sensors to detect nearby obstacles around the car.

Volvo-Uber uses a top-mounted 360-degree LIDAR to detect road objects, short- and long-range optical cameras to identify road signals, and radar to sense nearby obstacles.

Waymo uses a 360 degree LIDAR to detect road objects, 9 visual cameras to track the road, and radar for obstacle identification near the car.

images: from Tesla, Volvo, Waymo, by Wevolver

Geo-mapping and sensor fusion instil awareness in the AV

Once the autonomous vehicle has scanned its environment, it can find its location on the road relative to the objects around it. This information is critical for lower-level path planning to avoid collisions with objects in the vehicle's immediate vicinity. In addition, the vehicle needs its geographic location, which translates to a latitude and longitude, to know its local and global position on Earth and to determine a drive path.

Map services such as Google Maps are widely used for navigation, but HD maps may be required to increase the spatial and contextual awareness of autonomous vehicles. Other methods are also being explored, such as Apple's autonomous navigation system and Tesla's high-precision lane line maps. Wayve uses only standard sat-nav and cameras, while MIT took a 'map-less' approach, using LIDAR sensors for all aspects of navigation and relying on GPS only for a rough location estimate.

Based on all the raw data captured by the vehicle's sensors and on pre-existing maps, the automated driving system uses SLAM algorithms to construct and update a map of its environment while keeping track of its own location within it. To improve SLAM accuracy, sensor fusion comes in handy: the process of combining data from multiple sensors and databases to obtain better information than any single source can provide. Once its location on the map is known, the system can start planning which path to take to get from point A to point B.
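As a toy illustration of what sensor fusion does (not any manufacturer's actual pipeline), a one-dimensional Kalman-style update can blend a position predicted from wheel odometry with a noisy GPS reading, weighting each by its uncertainty. The function and variable names below are invented for the example.

```python
def fuse_position(pred_pos: float, pred_var: float,
                  gps_pos: float, gps_var: float):
    """One-dimensional Kalman update: combine a predicted position (from the
    odometry/motion model) with a GPS measurement, each with its variance.
    Returns the fused position and its (smaller) variance."""
    k = pred_var / (pred_var + gps_var)           # Kalman gain
    fused_pos = pred_pos + k * (gps_pos - pred_pos)
    fused_var = (1.0 - k) * pred_var
    return fused_pos, fused_var

# Odometry says 105.0 m (variance 4.0), GPS says 100.0 m (variance 1.0):
pos, var = fuse_position(105.0, 4.0, 100.0, 1.0)
print(round(pos, 2), round(var, 2))  # 101.0 0.8
```

The fused estimate sits closer to the less uncertain source, and its variance is smaller than either input, which is exactly the benefit fusion provides.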

Processing data using Machine Learning

A complex end-to-end solution based on deep learning algorithms handles all the processing and decision making required to go from sensor data to actual motion. Every step of the process, from sensing, localization and mapping, through path planning, to motion control, is handled by a single, comprehensive software element that directly maps sensor inputs to driving actions.

These systems can be created with the help of several different types of machine learning methods, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Deep Learning, and Reinforcement Learning.

CNNs are mainly used to process images and spatial information, extracting features of interest and identifying objects in the environment.
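As a rough sketch of the idea (using PyTorch purely as an example framework; the architecture and class name are invented, not anything a manufacturer ships), a tiny CNN that maps a camera image to scores over a few road-object classes could look like this:

```python
import torch
import torch.nn as nn

class TinyPerceptionCNN(nn.Module):
    """Toy CNN: maps a 3-channel camera image to scores over a few
    road-object classes (e.g. car, pedestrian, cyclist, background)."""
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

scores = TinyPerceptionCNN()(torch.randn(1, 3, 64, 64))  # one 64x64 image
print(scores.shape)  # torch.Size([1, 4])
```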

RNNs are used to extract temporal information, that is, to figure out how an object is moving over time. Such temporal information can be used by the self-driving car to correctly anticipate the future actions of surrounding traffic and adjust its trajectory as needed.
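A similarly hedged sketch of the temporal side: an LSTM (one common type of RNN) that reads a short history of an observed object's (x, y) positions and predicts where it will be at the next time step. Again, the names and sizes are illustrative only.

```python
import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    """Toy RNN: given the last N (x, y) positions of a tracked object,
    predict where it will be at the next time step."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(history)          # (batch, steps, hidden)
        return self.head(out[:, -1, :])      # predict from the last step

history = torch.randn(1, 10, 2)              # 10 past (x, y) observations
print(TrajectoryLSTM()(history).shape)       # torch.Size([1, 2])
```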

Deep Reinforcement Learning (DRL) combines Deep Learning (DL) and Reinforcement Learning (RL). DRL methods let software-defined 'agents' learn the best possible actions to achieve their goals in a virtual environment using a reward function.
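The reward function is the part that encodes 'acceptable driving'. As a purely illustrative toy (the weights and signals below are made up, not a published reward design), it might reward forward progress and penalise lane deviation, harsh braking, and collisions:

```python
def driving_reward(progress_m: float, lane_offset_m: float,
                   collided: bool, hard_brake: bool) -> float:
    """Toy reward shaping for a driving agent (illustrative only):
    reward forward progress, penalise drifting out of lane, harsh
    braking, and, heavily, collisions."""
    reward = 1.0 * progress_m
    reward -= 0.5 * abs(lane_offset_m)
    if hard_brake:
        reward -= 2.0
    if collided:
        reward -= 100.0
    return reward

# 3 m of progress, slightly off-centre, no incident: small positive reward.
print(driving_reward(3.0, 0.2, collided=False, hard_brake=False))  # 2.9
```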

Autonomous capabilities are achieved by training the system with colossal volumes of data

Machine learning algorithms are perfected when they are trained on datasets that represent realistic scenarios. Many open-source datasets, made available by researchers and companies including Aptiv, Lyft, Waymo, and Baidu, are used for tasks such as semantic segmentation, which labels each pixel of an image with the class of object it represents (street objects, for example), as well as for sign classification, pedestrian detection, and depth prediction.

Autonomous vehicles rely on machine learning algorithms not only to perceive their environment but also to act on that data to control the car. Path planning can be taught to a CNN through imitation learning, in which the CNN tries to imitate the behavior of a driver from billions of hours of footage of real driving. In more advanced algorithms, DRL is used, where a reward is provided to the autonomous system for driving in an acceptable manner.
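A minimal sketch of that imitation-learning (behavioural cloning) objective, again using PyTorch only as an illustrative framework with made-up sizes: the policy network is trained to reproduce the steering angle the human driver chose for each logged camera frame.

```python
import torch
import torch.nn as nn

# Behavioural-cloning sketch (illustrative): the policy learns to copy the
# steering angle a human driver applied for each recorded camera frame.
policy = nn.Sequential(nn.Flatten(),
                       nn.Linear(3 * 64 * 64, 128), nn.ReLU(),
                       nn.Linear(128, 1))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

camera_frames = torch.randn(8, 3, 64, 64)   # mini-batch of logged frames
human_steering = torch.randn(8, 1)          # the driver's recorded steering

loss = loss_fn(policy(camera_frames), human_steering)  # "imitate the human"
optimizer.zero_grad()
loss.backward()
optimizer.step()
```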

Localised computational power is a necessity

Training neural networks, and inference during the operation of the vehicle, require enormous computing power. Most machine learning tasks are executed on cloud-based infrastructure with large computing power and cooling. With autonomous vehicles, however, relying on the cloud alone may not be possible, as the vehicle needs to react to new data in real time. As such, part of the processing required to operate the vehicle needs to take place on board, while model refinements can be done in the cloud.

Recent advances in machine learning focus on how the huge amount of data generated by the sensors on board autonomous vehicles can be processed efficiently to reduce the computational cost, using concepts such as attention or core-sets. In addition, advances in chip manufacturing and miniaturization are increasing the computing capacity that can be mounted on an autonomous vehicle. With advances in networking protocols, cars might also be able to rely on low-latency, network-based processing of data to aid their autonomous operation.

Communicating and connecting with the environment

Autonomous vehicles won't gain widespread acceptance until the riding public feels assured of their safety and security, not only for passengers but also for other vehicles and pedestrians. Hence, Vehicle-to-Vehicle (V2V), vehicle-to-other-road-participants (V2P), and vehicle-to-traffic-infrastructure (V2I) information sharing is critical for autonomous vehicles. This communication enables better traffic management, by interacting with autonomous and non-autonomous traffic, and improves pedestrian safety.

The communication systems needed can be summed up under the umbrella term Vehicle-to-Everything (V2X) communications. The technology used to achieve this is Dedicated Short-Range Communication (DSRC), Cellular V2X (C-V2X), or both. Communication between vehicles and other vehicles or devices is set up directly, without network access, through an interface called PC5. This interface is useful for basic safety services, such as sudden-braking warnings, or for traffic data collection. C-V2X also provides another communication interface, called Uu, which allows the vehicle to communicate directly with the cellular network, a feature that DSRC does not provide.
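To give a feel for what travels over that direct PC5 link, a vehicle periodically broadcasts a small safety message with its position, speed, heading, and brake status. The sketch below is only illustrative; the field names and JSON encoding are invented for this example and do not follow the actual standardised message format.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class SafetyBroadcast:
    """Illustrative V2V safety payload (field names invented for this sketch;
    real deployments use a compact, standardised binary encoding)."""
    vehicle_id: str
    latitude: float
    longitude: float
    speed_m_s: float
    heading_deg: float
    hard_braking: bool

msg = SafetyBroadcast("veh-042", 37.7749, -122.4194, 13.9, 92.0, True)
print(json.dumps(asdict(msg)))  # what a nearby car would receive and parse
```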

Currently, C-V2X relies on fourth-generation (LTE/4G) mobile networks. These are fast enough for gaming or streaming content but lack the speed and resilience required to sustain autonomous vehicle network operations. The arrival of 5G services can greatly boost the accuracy and reliability of V2X communication technology. The main advantages of 5G include greater data speeds (25-50% faster than 4G LTE), lower latency (25-40% lower than 4G LTE), and the ability to serve more devices. However, the security of V2X communications and regulatory challenges remain unresolved.

Beyond the communication standard, the cloud network architecture is also a key component of autonomous vehicles. In this space, the infrastructure developed by companies such as Amazon AWS, Google Cloud, and Microsoft Azure for other applications is already mature enough to handle autonomous vehicle workloads.

Power, heat, weight, and size challenges are still a concern

Autonomous vehicles also face challenges around the power consumption, thermal footprint, weight, and size of the vehicle's components. The prime driver of high power consumption is the computational requirement: the system has to process more lines of code than any software platform or operating system created so far.

The thermal performance of the vehicle is also a necessary consideration, as increased processing demand and higher power throughput heat up the system. Cooling electronic components and keeping them within certain temperature ranges, regardless of the vehicle's external conditions, is essential for the proper functioning of the system. But extra cooling systems (especially liquid-based ones), extra components, and extra wiring all add to the weight and size of the vehicle. One way to compensate is to reduce the size of LIDARs and other semiconductor components.

Summary

In 2020, the state of autonomous vehicles is such that no technology is yet capable of Level 5, full automation. Level 4, or high automation, has achieved the ability to drive without human supervision and intervention, albeit under strictly defined conditions. Autonomous vehicle technology is labouring under many unforeseen challenges for technology developers and scaled-back projections for automakers. To make an autonomous future a reality, significant collaboration throughout the auto and technology industries will be needed. Additionally, policymakers need to get on board with industry players in figuring out how to push this autonomous future forward. Undoubtedly, once achieved, this technology will change the world well beyond individual personal transportation, including public transportation, delivery and cargo, and specialty vehicles for farming and mining.


Shaan

Business Analyst and Startup enthusiast focused on Artificial Intelligence and Deep Learning