Emergent Mind

Evaluation of Large Language Models for Decision Making in Autonomous Driving

(2312.06351)
Published Dec 11, 2023 in cs.CV, cs.CL, and cs.RO

Abstract

Various methods have been proposed for utilizing LLMs in autonomous driving. One strategy of using LLMs for autonomous driving involves inputting surrounding objects as text prompts to the LLMs, along with their coordinate and velocity information, and then outputting the subsequent movements of the vehicle. When using LLMs for such purposes, capabilities such as spatial recognition and planning are essential. In particular, two foundational capabilities are required: (1) spatial-aware decision making, which is the ability to recognize space from coordinate information and make decisions to avoid collisions, and (2) the ability to adhere to traffic rules. However, quantitative research has not been conducted on how accurately different types of LLMs can handle these problems. In this study, we quantitatively evaluated these two abilities of LLMs in the context of autonomous driving. Furthermore, to conduct a Proof of Concept (POC) for the feasibility of implementing these abilities in actual vehicles, we developed a system that uses LLMs to drive a vehicle.

Overview

  • The paper explores LLMs' ability to make decisions in autonomous driving by understanding spatial relationships and adhering to traffic rules.

  • The study evaluates LLMs in simulation and in an actual vehicle test, focusing on spatial-aware decision-making (SADM) and following traffic rules (FTR).

  • Three LLMs were tested: the public LLM LLaMA-2 7B, and the private models GPT-3.5 and GPT-4, with GPT-4 showing the highest accuracy across all metrics.

  • GPT-4 could explain the reasoning behind its decisions and set priorities correctly even when given an incorrect human command, for example by keeping to the speed limit.

  • The study acknowledges limitations for real-world use: reliance on internet communication, processing time, and the trade-off between accuracy and computational efficiency.

Introduction

The integration of LLMs into autonomous driving systems has raised the question of how well these models can comprehend spatial relationships and adhere to traffic rules when making decisions. Driving conditions are unpredictable, and real scenarios often diverge from structured datasets, so the prospect of LLMs drawing on their training over diverse text to handle unfamiliar scenarios is particularly appealing. These models may also tackle higher-level tasks such as applying ethical reasoning in complex driving situations.

Methodology

The assessment covers two abilities: spatial-aware decision-making (SADM) and following traffic rules (FTR). It combines simulated traffic scenarios, used for detailed analysis, with a test on an actual vehicle to examine practical application. In the simulations, LLMs were presented with scenarios on a two-lane road and had to consider vehicle positions and velocities, as well as traffic rules conveyed in natural language, then choose an appropriate driving action and explain their reasoning. In the real-world experiments, an LLM guided a vehicle towards destinations while obeying commands from a traffic officer, which exercised both its spatial awareness and its rule following.
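The simulation setup described above can be illustrated with a minimal sketch of how a scenario might be turned into a text prompt. This is not the paper's actual prompt: the `Vehicle` fields, the `ACTIONS` list, the wording of the instructions, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical action set; the paper's exact wording may differ.
ACTIONS = (
    "accelerate",
    "maintain",
    "decelerate",
    "change lane to the left",
    "change lane to the right",
)


@dataclass
class Vehicle:
    name: str
    lane: str         # "left" or "right" on the two-lane road
    x_m: float        # position along the road relative to ego, in metres
    speed_kmh: float  # speed in km/h


def describe(v: Vehicle) -> str:
    """Render one vehicle's coordinates and velocity as a line of text."""
    return f"- {v.name}: lane={v.lane}, x={v.x_m:+.1f} m, speed={v.speed_kmh:.0f} km/h"


def build_prompt(ego: Vehicle, others: Sequence[Vehicle], rules: Sequence[str]) -> str:
    """Assemble the scenario, the rules, and the answer format into one prompt."""
    lines: List[str] = [
        "You are driving a car on a two-lane road.",
        "Vehicles (x is the distance ahead of you in metres):",
        describe(ego),
    ]
    lines += [describe(v) for v in others]
    lines.append("Traffic rules:")
    lines += [f"- {rule}" for rule in rules]
    lines.append("Choose exactly one action from: " + ", ".join(ACTIONS) + ".")
    lines.append("Explain your reasoning, then state the action.")
    return "\n".join(lines)


# Made-up scenario: a slower car is 20 m ahead in the same lane.
ego = Vehicle("ego", "right", 0.0, 60.0)
others = [Vehicle("car A", "right", 20.0, 40.0)]
rules = ["The speed limit is 60 km/h."]
prompt = build_prompt(ego, others, rules)
print(prompt)
```

The resulting string would be sent to whichever LLM is under test; asking for the reasoning before the action mirrors the summary's note that the models had to explain their choices.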

Experiments and Results

Quantitative and qualitative evaluations were conducted on three LLMs with varying capabilities. The metrics measured whether the model chose the correct action in three settings: spatial decisions such as when to accelerate or change lanes (SADM), adherence to speed limits and overtaking rules (FTR), and scenarios that require both at once (SADM&FTR). The public LLM LLaMA-2 7B was compared with the more powerful private models GPT-3.5 and GPT-4. GPT-4 was the most accurate across all metrics and appeared to benefit from being prompted to provide its reasoning. A qualitative review found that GPT-4 could even override an incorrect human instruction in order to comply with the speed limit, which suggests it understood which constraint took priority in that situation.
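As a rough sketch of how per-category accuracy of this kind could be computed, the snippet below scores predicted actions against expected ones. The record format, the string-matching rule, and the sample data are hypothetical and are not results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (category, predicted action, expected action)


def accuracy_by_category(records: Iterable[Record]) -> Dict[str, float]:
    """Return the fraction of correct actions for each evaluation category."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for category, predicted, expected in records:
        total[category] += 1
        # Assumption: a prediction counts as correct on a case-insensitive exact match.
        if predicted.strip().lower() == expected.strip().lower():
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Made-up records for illustration only.
records = [
    ("SADM", "decelerate", "decelerate"),
    ("SADM", "accelerate", "change lane to the right"),
    ("FTR", "maintain", "maintain"),
    ("SADM&FTR", "Decelerate", "decelerate"),
]
scores = accuracy_by_category(records)
print(scores)
```

Exact matching is the simplest possible scoring rule; free-form LLM output would normally need the chosen action to be parsed out of the explanation first.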

Conclusion and Limitations

In both simulation and real-world tests, some LLMs, GPT-4 in particular, made accurate spatial-aware decisions and followed traffic rules. Their application in autonomous driving systems still faces challenges. Internet communication and processing times remain concerns for real-time decision-making, and the trade-off between decision-making accuracy and computational efficiency has to be weighed before LLMs can be built into practical autonomous driving solutions. These findings point to the need for further improvement of LLM capabilities before they are applied in the dynamic and complex setting of autonomous driving.
