Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search (2202.06443v3)

Published 14 Feb 2022 in cs.LG and cs.RO

Abstract: Cooperative trajectory planning methods for automated vehicles can solve traffic scenarios that require a high degree of cooperation between traffic participants. However, for cooperative systems to integrate into human-centered traffic, the automated systems must behave in a human-like manner so that humans can anticipate the system's decisions. While Reinforcement Learning has made remarkable progress in solving the decision-making part, it is non-trivial to parameterize a reward model that yields predictable actions. This work employs feature-based Maximum Entropy Inverse Reinforcement Learning combined with Monte Carlo Tree Search to learn reward models that maximize the likelihood of recorded multi-agent cooperative expert trajectories. The evaluation demonstrates that the approach can recover a reasonable reward model that mimics the expert and performs similarly to a manually tuned baseline reward model.
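To make the core idea in the abstract concrete, the following is a minimal sketch of a feature-based Maximum Entropy IRL update for a linear reward, r(tau) = theta . f(tau). It is not the authors' implementation. The sampled trajectory features here are a fixed, hypothetical toy set standing in for trajectories that the paper obtains via Monte Carlo Tree Search, and all names (softmax_weights, expected_features, maxent_irl_step) as well as the two-dimensional features and the learning rate are assumptions chosen for illustration.

```python
import math

def softmax_weights(theta, features):
    """Probability of each sampled trajectory, p(tau) proportional to exp(theta . f(tau))."""
    scores = [sum(t * f for t, f in zip(theta, feat)) for feat in features]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def expected_features(theta, features):
    """Feature expectation over the sampled trajectories under the current reward."""
    probs = softmax_weights(theta, features)
    return [
        sum(p * feat[k] for p, feat in zip(probs, features))
        for k in range(len(theta))
    ]

def maxent_irl_step(theta, expert_features, sampled_features, lr=0.1):
    """One gradient-ascent step on the expert log-likelihood.

    The gradient is the expert feature vector minus the expected
    feature vector under the current trajectory distribution.
    """
    expected = expected_features(theta, sampled_features)
    return [t + lr * (fe - ex) for t, fe, ex in zip(theta, expert_features, expected)]

# Hypothetical toy data: three sampled trajectories with two features each
# (in the paper's setting these samples would come from the tree search).
sampled = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
expert = [0.8, 0.2]  # hypothetical mean feature vector of expert trajectories

theta = [0.0, 0.0]
for _ in range(500):
    theta = maxent_irl_step(theta, expert, sampled)

matched = expected_features(theta, sampled)
```

After the loop, the feature expectation under the learned weights (matched) should be close to the expert's features, which is the feature-matching condition that maximizing the likelihood implies for a linear reward. In this toy setup the expert features lie inside the convex hull of the sampled features, so such a match exists.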
