- The paper presents a comprehensive review of deep reinforcement learning for motion planning by structuring autonomous vehicle decisions into route, behavioral, and trajectory planning layers.
- It analyzes diverse DRL methods such as Deep Q-Networks, policy gradient, and actor-critic approaches, demonstrating their application in simulation environments like CARLA, SUMO, and TORCS.
- The paper addresses practical challenges including safety validation, transfer learning, and the integration of DRL with traditional control systems to enhance real-world autonomy.
Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles
The paper "Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles" offers a comprehensive review of how Deep Reinforcement Learning (DRL) is applied to various sub-tasks in the motion planning of autonomous vehicles. The survey situates itself in a rapidly developing field where autonomous vehicle research intersects with advances in artificial intelligence, particularly DRL, to improve decision-making processes in autonomous transportation systems. DRL combines the conceptual framework of reinforcement learning (RL) with the computational power of deep neural networks to provide a robust toolkit for developing adaptive and autonomous systems.
Hierarchical Motion Planning
The paper organizes motion planning as a hierarchical decision-making structure with route planning, behavioral planning, and motion planning layers. The topmost layer, route planning, defines waypoints over pre-existing maps; RL techniques are rarely applied at this level. Behavioral planning sets short-term strategic maneuvers such as car-following and lane-changing; these are often modeled as Partially Observable Markov Decision Processes (POMDPs), acknowledging that the agent cannot observe every aspect of the environment. The motion planning layer then translates these maneuvers into executable trajectories while respecting nonholonomic vehicle dynamics, which complicate real-time planning.
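The layered decomposition can be made concrete with a short sketch. The class and method names below are illustrative stand-ins, not interfaces from the paper; they only show how responsibilities separate across the three layers.

```python
# Minimal sketch of the three-layer hierarchy described above.
# All names are hypothetical, chosen only for illustration.
from dataclasses import dataclass
from typing import List

@dataclass
class Waypoint:
    x: float
    y: float

class RoutePlanner:
    """Top layer: produces waypoints from a map; typically no RL here."""
    def plan(self, start: Waypoint, goal: Waypoint) -> List[Waypoint]:
        # e.g., graph search (A*) over a road network; omitted in this sketch
        return [start, goal]

class BehavioralPlanner:
    """Middle layer: short-term maneuver decisions (often a DRL policy)."""
    MANEUVERS = ("keep_lane", "change_left", "change_right", "follow")
    def decide(self, observation) -> str:
        # In the surveyed work this decision is frequently modeled as a
        # POMDP over nearby, partially observed traffic.
        return "keep_lane"

class MotionPlanner:
    """Bottom layer: turns a maneuver into a dynamically feasible trajectory."""
    def trajectory(self, maneuver: str, waypoints: List[Waypoint]) -> List[Waypoint]:
        # Must respect nonholonomic constraints (e.g., curvature limits)
        return waypoints
```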
Reinforcement Learning Framework
RL approaches for autonomous vehicle systems include value-based methods such as Deep Q-Networks (DQNs) and policy-based methods such as policy gradient learning and actor-critic architectures. These methods let an agent learn an optimal policy by interacting with its environment to maximize cumulative reward. The continuous action spaces typical of vehicular control, such as steering angle and acceleration, often require variants like Deep Deterministic Policy Gradient (DDPG) to manage the decision process effectively.
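As an illustration of the value-based family, here is a minimal DQN-style temporal-difference update in PyTorch, assuming a small discrete action set (e.g., a handful of maneuver choices). The network sizes and hyperparameters are arbitrary placeholders, not settings from any surveyed work.

```python
import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 32, 5, 0.99  # illustrative sizes

# Online Q-network and a frozen target copy (standard DQN stabilizer)
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_update(obs, action, reward, next_obs, done):
    """One gradient step toward the one-step TD target."""
    # Q(s, a) for the actions actually taken
    q_pred = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrap from the frozen target network
        q_next = target_net(next_obs).max(dim=1).values
        target = reward + gamma * (1.0 - done) * q_next
    loss = nn.functional.mse_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Demo on a random batch (placeholder data, not real driving transitions)
batch = 8
dqn_update(torch.randn(batch, obs_dim),
           torch.randint(0, n_actions, (batch,)),
           torch.randn(batch),
           torch.randn(batch, obs_dim),
           torch.zeros(batch))
```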
Simulation Environments and Observations
A substantial portion of the surveyed research employs simulation frameworks such as CARLA, SUMO, and TORCS to model realistic traffic scenarios with diverse vehicle dynamics and complex environments. Accurate simulation involves a trade-off between model fidelity and computational efficiency: high-fidelity models better approximate real-world conditions, but DRL training typically requires millions of environment interactions, so slow simulation directly limits learning.
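Whatever the simulator, most DRL training code interacts with it through a reset/step loop; every step call is where simulation fidelity costs time, which is why the trade-off above matters. The `DrivingEnv` below is a hypothetical stand-in for such a wrapper, not the actual API of CARLA, SUMO, or TORCS.

```python
import random

class DrivingEnv:
    """Hypothetical gym-style wrapper around a driving simulator."""
    def reset(self):
        return [0.0] * 4  # initial observation (placeholder features)

    def step(self, action):
        obs = [random.random() for _ in range(4)]
        reward = 1.0                    # e.g., route progress minus penalties
        done = random.random() < 0.01   # e.g., collision or episode timeout
        return obs, reward, done

env = DrivingEnv()
obs, episode_return = env.reset(), 0.0
for _ in range(1000):                   # each iteration = one simulator step
    action = random.randrange(5)        # placeholder for a learned policy
    obs, reward, done = env.step(action)
    episode_return += reward
    if done:
        obs, episode_return = env.reset(), 0.0
```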
For observation models, the paper highlights grid-based, camera-based, and lidar-based input representations. The format of the sensor data, whether unstructured image data or structured grid maps, influences the choice of neural network architecture: Convolutional Neural Networks (CNNs) commonly process pixel-based data, extracting the salient features needed for downstream decision-making.
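To illustrate how the observation format drives the architecture choice, the sketch below pairs a small CNN with camera-style image input and a flat network with an occupancy-grid input. All layer sizes and input shapes are assumptions for illustration, not values from the survey.

```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """CNN feature extractor for camera (pixel) observations."""
    def __init__(self, feature_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer the flattened size for an 84x84 input
            n_flat = self.conv(torch.zeros(1, 3, 84, 84)).shape[1]
        self.head = nn.Linear(n_flat, feature_dim)

    def forward(self, x):
        return self.head(self.conv(x))

class GridEncoder(nn.Module):
    """Flat MLP for a structured occupancy-grid observation."""
    def __init__(self, cells: int = 40 * 20, feature_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(cells, 256), nn.ReLU(),
                                 nn.Linear(256, feature_dim))

    def forward(self, x):
        return self.net(x)

# Both encoders emit a fixed-size feature vector for the policy head
img_features = ImageEncoder()(torch.zeros(1, 3, 84, 84))
grid_features = GridEncoder()(torch.zeros(1, 40 * 20))
```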
Addressing Challenges and Open Questions
The paper ends by addressing challenges and open questions in applying DRL to motion planning. Current methods predominantly tackle specific sub-tasks such as lane-keeping or car-following in isolation, exposing a divide between research problems and real-world applicability in complex traffic environments. Transfer learning, curriculum learning, safety validation, and integration with traditional control systems remain areas where additional work could translate into improvements in real-world autonomous vehicle systems.
The survey underscores that while significant theoretical and practical challenges remain, the potential for DRL methodologies to offer robust and adaptable frameworks for motion planning in autonomous vehicles is apparent. There is optimism within the research community that these technical challenges are surmountable with continued interdisciplinary efforts. Future research may well see DRL techniques more comprehensively integrated into adaptive and reliable autonomous driving systems.