Evaluation of Large Language Models for Decision Making in Autonomous Driving (2312.06351v1)

Published 11 Dec 2023 in cs.CV, cs.CL, and cs.RO

Abstract: Various methods have been proposed for utilizing LLMs in autonomous driving. One strategy of using LLMs for autonomous driving involves inputting surrounding objects as text prompts to the LLMs, along with their coordinate and velocity information, and then outputting the subsequent movements of the vehicle. When using LLMs for such purposes, capabilities such as spatial recognition and planning are essential. In particular, two foundational capabilities are required: (1) spatial-aware decision making, which is the ability to recognize space from coordinate information and make decisions to avoid collisions, and (2) the ability to adhere to traffic rules. However, quantitative research has not been conducted on how accurately different types of LLMs can handle these problems. In this study, we quantitatively evaluated these two abilities of LLMs in the context of autonomous driving. Furthermore, to conduct a Proof of Concept (POC) for the feasibility of implementing these abilities in actual vehicles, we developed a system that uses LLMs to drive a vehicle.

Summary

The paper demonstrates that GPT-4’s advanced reasoning leads to superior spatial-aware decisions and precise traffic rule compliance compared to other LLMs.
It employs a dual-method evaluation using simulated scenarios and real-world tests to measure decision-making accuracy in dynamic driving environments.
The study highlights challenges such as communication latency and computational efficiency, emphasizing the need for prompt-guided reasoning in practical autonomous driving applications.

Introduction

The integration of LLMs into autonomous driving systems has spurred interest in how well these models can comprehend spatial relationships and adhere to traffic rules when making decisions. Driving conditions are unpredictable, and real scenarios often diverge from structured datasets, so the potential of LLMs to draw on knowledge from training on diverse text when facing unfamiliar scenarios is particularly appealing. These models may also tackle higher-level tasks such as applying ethical reasoning in complex driving situations.

Methodology

The assessment of LLMs' decision-making abilities in autonomous driving is twofold: spatial-aware decision-making (SADM) and following the traffic rules (FTR). The study combined simulated traffic scenarios, used for detailed quantitative analysis, with a test on an actual vehicle to examine practical feasibility. In the simulations, LLMs were presented with scenarios on a two-lane road and had to consider vehicle positions and velocities, as well as traffic rules conveyed in natural language. They had to choose an appropriate driving action and explain their reasoning. In the real-world experiments, an LLM guided a vehicle towards destinations while obeying commands from a traffic officer, which exercised both its spatial-awareness and rule-following capabilities.
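The simulated setup described above can be sketched as a prompt builder plus a reply parser. This is a minimal illustration, not the paper's implementation: the `Vehicle` fields, the action names in `ACTIONS`, the coordinate convention, and the JSON reply format are all assumptions made for the example.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary; the paper's exact action names may differ.
ACTIONS = ("accelerate", "maintain", "decelerate",
           "change_lane_left", "change_lane_right")


@dataclass
class Vehicle:
    name: str
    lane: str         # "left" or "right" on a two-lane road (assumed labels)
    x_m: float        # longitudinal position in metres (assumed convention)
    speed_kmh: float  # speed in km/h


def describe(v: Vehicle) -> str:
    """Render one vehicle's state as a single line of prompt text."""
    return f"{v.name}: lane={v.lane}, x={v.x_m:.1f} m, speed={v.speed_kmh:.1f} km/h"


def build_prompt(ego: Vehicle, others: list, rules: list) -> str:
    """Combine vehicle states and natural-language rules into one text prompt."""
    lines = ["You are driving the ego vehicle on a two-lane road.", describe(ego)]
    lines.extend(describe(v) for v in others)
    lines.append("Traffic rules:")
    lines.extend(f"- {rule}" for rule in rules)
    lines.append(
        'Reply with JSON only: {"reasoning": "<why>", "action": "<one of '
        + ", ".join(ACTIONS) + '>"}'
    )
    return "\n".join(lines)


def parse_decision(reply: str) -> tuple:
    """Extract (action, reasoning) from a JSON reply and validate the action."""
    data = json.loads(reply)
    action = data["action"]
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return action, data.get("reasoning", "")
```

Asking for the reasoning field before the action mirrors the summary's observation that models were asked to explain their choice; whether the original prompts used this ordering or this format is not stated here.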

Experiments and Results

Quantitative and qualitative evaluations were conducted on three LLMs with varying capabilities. The tasks measured whether the model made the correct decision, such as when to accelerate or change lanes (SADM), whether it adhered to speed limits and overtaking rules (FTR), and whether it could handle both at once (SADM&FTR). The openly available LLaMA-2 7B was compared with the more powerful proprietary models GPT-3.5 and GPT-4. GPT-4 was the most accurate on all three tasks and appeared to benefit from being prompted to provide its reasoning. A qualitative review highlighted that GPT-4 could even override incorrect human instructions in order to comply with the speed limit, which suggests it can prioritize traffic rules over conflicting commands in a given situation.
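As a rough sketch of how such per-task scores could be tallied, the snippet below assumes accuracy is the fraction of scenarios in which the model's chosen action matches a labeled correct action. The record layout and the function name `accuracy_by_task` are hypothetical, and the toy records are for illustration only; they are not results from the paper.

```python
from collections import defaultdict


def accuracy_by_task(records):
    """Return {task: accuracy} from (task, predicted_action, expected_action) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, expected in records:
        total[task] += 1
        correct[task] += int(predicted == expected)
    return {task: correct[task] / total[task] for task in total}


# Toy records for illustration only; these are NOT results from the paper.
toy_records = [
    ("SADM", "decelerate", "decelerate"),
    ("SADM", "accelerate", "change_lane_left"),
    ("FTR", "maintain", "maintain"),
    ("SADM&FTR", "maintain", "decelerate"),
]

if __name__ == "__main__":
    for task, acc in accuracy_by_task(toy_records).items():
        print(f"{task}: {acc:.2f}")
```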

Conclusion and Limitations

In both simulation and real-world tests, certain LLMs, especially GPT-4, made accurate spatial-aware decisions and followed traffic rules reliably. However, their application in autonomous driving systems is not without challenges. Internet communication delays and model processing times remain concerns for real-time decision-making. Additionally, the trade-off between decision-making accuracy and computational efficiency is a crucial consideration when incorporating LLMs into practical autonomous driving solutions. These findings underscore the importance of improving LLM capabilities before applying them in the dynamic and complex field of autonomous driving.
