Flow-guided Motion Prediction with Semantics and Dynamic Occupancy Grid Maps (2407.15675v1)
Abstract: Accurate prediction of driving scenes is essential for road safety and autonomous driving. Occupancy Grid Maps (OGMs) are commonly employed for scene prediction due to their structured spatial representation, flexibility across sensor modalities and integration of uncertainty. Recent studies have successfully combined OGMs with deep learning methods to predict the evolution of the scene and learn complex behaviours. These methods, however, do not consider prediction of flow or velocity vectors in the scene. In this work, we propose a novel multi-task framework that leverages dynamic OGMs and semantic information to predict both future vehicle semantic grids and the future flow of the scene. This incorporation of semantic flow not only offers intermediate scene features but also enables the generation of warped semantic grids. Evaluation on the real-world nuScenes dataset demonstrates improved prediction capabilities and an enhanced ability of the model to retain dynamic vehicles within the scene.
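To make the idea of a warped semantic grid concrete, the following is a minimal sketch of warping a 2D grid with a per-cell flow field. It is an illustration under stated assumptions, not the paper's implementation: the function name `warp_grid`, the backward-flow convention (each target cell stores the offset to its source cell), and nearest-neighbour lookup are all hypothetical choices made for this example.

```python
def warp_grid(grid, backward_flow):
    """Warp a 2D grid with a backward flow field (nearest-neighbour lookup).

    grid: list of rows, e.g. vehicle occupancy probabilities at time t.
    backward_flow: same shape as grid; backward_flow[y][x] = (dy, dx) is the
        offset, in cells, from target cell (y, x) to its source cell in grid.
        This convention is an assumption made for the sketch.
    Target cells whose source lies outside the grid are left at 0.0.
    """
    h, w = len(grid), len(grid[0])
    warped = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = backward_flow[y][x]
            sy, sx = round(y + dy), round(x + dx)  # nearest source cell
            if 0 <= sy < h and 0 <= sx < w:
                warped[y][x] = grid[sy][sx]
    return warped


# Toy example: a single occupied cell moves two columns to the right.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][1] = 1.0
# Every target cell reads from the cell two columns to its left.
flow = [[(0.0, -2.0)] * 5 for _ in range(5)]
warped = warp_grid(grid, flow)
```

In this toy case the occupied cell at row 2, column 1 ends up at row 2, column 3. A learned model would more likely use a differentiable bilinear sampler in place of the nearest-neighbour lookup, so that gradients can flow through the warp.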