A Benchmark for Unsupervised Anomaly Detection in Multi-Agent Trajectories

Published 5 Sep 2022 in cs.RO | (2209.01838v1)

Abstract: Human intuition allows to detect abnormal driving scenarios in situations they never experienced before. Like humans detect those abnormal situations and take countermeasures to prevent collisions, self-driving cars need anomaly detection mechanisms. However, the literature lacks a standard benchmark for the comparison of anomaly detection algorithms. We fill the gap and propose the R-U-MAAD benchmark for unsupervised anomaly detection in multi-agent trajectories. The goal is to learn a representation of the normal driving from the training sequences without labels, and afterwards detect anomalies. We use the Argoverse Motion Forecasting dataset for the training and propose a test dataset of 160 sequences with human-annotated anomalies in urban environments. To this end we combine a replay of real-world trajectories and scene-dependent abnormal driving in the simulation. In our experiments we compare 11 baselines including linear models, deep auto-encoders and one-class classification models using standard anomaly detection metrics. The deep reconstruction and end-to-end one-class methods show promising results. The benchmark and the baseline models will be publicly available.

Abstract PDF Upgrade to Chat

Authors (5)

Citations (6)

View on Semantic Scholar

Summary

The paper proposes the R-U-MAAD benchmark, a novel framework integrating simulation and real data to assess anomaly detection in multi-agent driving scenarios.
It augments the Argoverse dataset with human-labeled anomalies, categorizing behaviors into 9 normal maneuvers and 13 distinct abnormal classes.
Deep reconstruction models, especially STGAE, and one-class classifiers demonstrate robust performance with metrics like AUPR and AUROC.

A Benchmark for Unsupervised Anomaly Detection in Multi-Agent Trajectories

The paper "A Benchmark for Unsupervised Anomaly Detection in Multi-Agent Trajectories" proposes the R-U-MAAD benchmark to address the lack of a standardized framework for evaluating unsupervised anomaly detection (AD) techniques in the field of multi-agent driving behaviors. Using the Argoverse Motion Forecasting dataset as a foundation, the authors introduce an augmented test dataset with human-labeled anomalies tailored for urban traffic environments.

Motivation and Objectives

Self-driving vehicles must replicate human-like intuition to detect anomalous driving scenarios and take preventive measures accordingly. The absence of a comprehensive dataset for benchmarking AD models motivated the creation of the R-U-MAAD dataset. The benchmark's objective is to facilitate learning normal and abnormal driving behaviors using trajectories from the Argoverse dataset, both for model training without explicit anomaly labels and for testing with anomaly annotations. This work aims to set a baseline for AD in multi-agent trajectories using unsupervised learning methods.

Methodology and Dataset

The methodology hinges on hijacking a single vehicle from the Argoverse validation set to conduct scenario simulations in a multi-agent environment, thereby generating abnormal trajectory data through simulation while preserving normal driving sequences.

Figure 1: Our pipeline for abnormal trajectory generation, visualizing a scene from the HD-map with the target vehicle controlled for simulations.

A human operator induces abnormal scenarios, resulting in a dataset, R-U-MAAD, which includes individual and interaction-focused abnormalities. This dataset complements real-world trajectory data with simulation-derived anomalies.

Data Annotations

The paper meticulously annotates the driving behaviors as either normal with 9 distinct maneuvers, or abnormal with 13 unique anomaly classes such as ghost driving, aggressive maneuvers, and interactions leading to potential collisions.

Figure 2: Example scenarios highlighting normal and abnormal interactions, mapping lane configurations and agent trajectories in diverse driving conditions.

Model Architectures and Baselines

The authors evaluate a rich suite of baseline models comprising linear, deep reconstruction, and one-class classification methods. The reconstruction models, notably sequence-to-sequence (Seq2Seq) variations and spatio-temporal graph auto-encoders (STGAE), adaptively approximate the path trajectories. Comparatively, one-class classifiers utilize Support Vector Machines (SVM) and Deep Support Vector Data Description (DSVDD) objectives to learn separation between normal and anomalous trajectory representations.

Figure 3: Comprehensive flow of anomaly detection baselines, distinguishing between trajectory reconstruction and one-class classification approaches.

Performance Evaluation

Deep auto-encoders particularly the STGAE model, demonstrated proficiency in generating representative feature spaces of normal driving, thereby achieving commendable performance metrics like AUPR-Abnormal and AUROC.

Figure 4: Qualitative assessments indicate score transitions in scenarios, with notable increases during abnormal behaviors and consistent low scores otherwise.

Experimental Results

The benchmark tests demonstrated that deep learning models, especially with enhancements incorporating spatio-temporal and context features like LaneGCN-AE, exhibit superior accuracy and reliability over traditional linear modeling techniques. Metrics like AUPR consistently highlighted the deep networks' edge in predictive fidelity under unsupervised learning paradigms. However, the potential of exploiting context-derived information indicates prospective research directions for augmenting anomaly detection segments.

Conclusion

The R-U-MAAD benchmark facilitates robust evaluation for anomaly detection in multi-agent traffic scenarios by blending real-world data with simulation-generated anomalies. The proffered baseline models offer a substantive comparison point for future advancements in unsupervised AD systems. The presentation emphasizes expanding research directions to exploit interaction-aware models effectively, thus enhancing the robustness of autonomous vehicle systems in detecting and adapting to unusual driving conditions.

This benchmark is poised to stimulate ongoing research and pave the way towards safer and more predictive autonomous navigation in dynamic, multi-agent environments.

Markdown Report Issue