
Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving (2401.15315v2)

Published 27 Jan 2024 in cs.RO

Abstract: Effective decision-making in autonomous driving relies on accurate inference of other traffic agents' future behaviors. To achieve this, we propose an online belief-update-based behavior prediction model and an efficient planner for Partially Observable Markov Decision Processes (POMDPs). We develop a Transformer-based prediction model, enhanced with a recurrent neural memory model, to dynamically update the latent belief state and infer the intentions of other agents. The model can also integrate the ego vehicle's intentions to reflect closed-loop interactions among agents, and it learns from both offline data and online interactions. For planning, we employ a Monte Carlo Tree Search (MCTS) planner with macro actions, which reduces computational complexity by searching over temporally extended action steps. Inside the MCTS planner, we use predicted long-term multi-modal trajectories to approximate future belief updates, which eliminates iterative belief updating and improves running efficiency. Our approach also incorporates a deep Q-network (DQN) as a search prior, which significantly improves the performance of the MCTS planner. Experimental results from simulated environments validate the effectiveness of our proposed method. The online belief update model significantly enhances the accuracy and temporal consistency of predictions, leading to improved decision-making performance. Employing the DQN as a search prior in the MCTS planner considerably boosts its performance and outperforms an imitation learning-based prior. Additionally, we show that MCTS planning with macro actions substantially outperforms the vanilla method in both performance and efficiency.
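The macro-action idea in the abstract can be illustrated with a minimal sketch: each tree edge applies a fixed sequence of primitive accelerations, so the search covers a long horizon with a shallow tree. Everything below (the three-macro library, the toy longitudinal dynamics, the progress-minus-comfort reward) is a simplifying assumption for illustration, not the authors' implementation:

```python
import math
import random

# Hypothetical macro-action library: each macro is a fixed sequence of
# primitive accelerations, so one tree edge spans several timesteps.
MACRO_ACTIONS = {
    "keep":  [0.0] * 5,
    "accel": [1.0] * 5,
    "brake": [-1.5] * 5,
}
DT = 0.2  # simulation timestep in seconds (assumed)

def rollout_macro(state, accels):
    """Apply a macro action to a toy longitudinal state (position, speed).

    Returns the resulting state and a progress-minus-comfort reward.
    """
    pos, vel = state
    reward = 0.0
    for a in accels:
        vel = max(0.0, vel + a * DT)
        pos += vel * DT
        reward += vel * DT - 0.1 * abs(a)
    return (pos, vel), reward

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}  # macro-action name -> child Node
        self.visits = 0
        self.value = 0.0    # sum of returns backed up through this node

def ucb_select(node, c=1.4):
    # Standard UCB1 over the node's fully expanded children.
    return max(
        node.children.items(),
        key=lambda kv: kv[1].value / kv[1].visits
        + c * math.sqrt(math.log(node.visits) / kv[1].visits),
    )[0]

def mcts(root_state, n_iters=300, depth=3):
    root = Node(root_state)
    for _ in range(n_iters):
        node, path, ret = root, [root], 0.0
        for _ in range(depth):
            if len(node.children) < len(MACRO_ACTIONS):
                # Expansion: try an untried macro action.
                name = random.choice(
                    [m for m in MACRO_ACTIONS if m not in node.children])
            else:
                # Selection: UCB over fully expanded children.
                name = ucb_select(node)
            state, r = rollout_macro(node.state, MACRO_ACTIONS[name])
            if name not in node.children:
                node.children[name] = Node(state)
            node = node.children[name]
            path.append(node)
            ret += r
        for n in path:  # backpropagation
            n.visits += 1
            n.value += ret
    # Commit to the most-visited macro action at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Because each edge covers five primitive steps, a depth-3 tree already plans 15 steps ahead; the paper's planner additionally replaces the hand-coded rollout with predicted multi-modal trajectories and biases selection with a learned DQN prior, neither of which is modeled in this sketch.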

