SIMMF: Semantics-aware Interactive Multiagent Motion Forecasting for Autonomous Vehicle Driving (2306.14941v2)
Abstract: Autonomous vehicles require motion forecasting of the surrounding agents (pedestrians and vehicles) to make optimal decisions for navigation. Existing methods focus on utilizing the positions and velocities of these agents and fail to capture semantic information from the scene. Moreover, to mitigate the increase in computational complexity associated with the number of agents in the scene, some works prune far-away agents by Euclidean distance. However, a distance-based metric alone is insufficient for selecting relevant agents and producing accurate predictions. To resolve these issues, we propose the Semantics-aware Interactive Multiagent Motion Forecasting (SIMMF) method to capture semantics along with spatial information and optimally select relevant agents for motion prediction. Specifically, we achieve this by implementing a semantic-aware selection of relevant agents from the scene and passing them through an attention mechanism to extract global encodings. These encodings, along with the agents' local information, are passed through an encoder to obtain time-dependent latent variables for a motion policy that predicts the future trajectories. Our results show that the proposed approach outperforms state-of-the-art baselines and provides more accurate and scene-consistent predictions.
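The pipeline described in the abstract (semantic-aware agent selection, attention-based global encodings, a latent encoder, and a motion policy) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the selection rule, feature shapes, single-head attention, and random linear layers are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_relevant_agents(features, classes, relevant_classes):
    # Semantic-aware selection: keep agents by class relevance rather than
    # pruning by Euclidean distance alone (exact rule is an assumption).
    keep = [i for i, c in enumerate(classes) if c in relevant_classes]
    return features[keep]

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_encodings(local_feats, d=8):
    # Single-head scaled dot-product self-attention over the selected
    # agents, yielding one global encoding per agent (illustrative form).
    Wq = rng.standard_normal((local_feats.shape[1], d)) * 0.1
    Wk = rng.standard_normal((local_feats.shape[1], d)) * 0.1
    Wv = rng.standard_normal((local_feats.shape[1], d)) * 0.1
    Q, K, V = local_feats @ Wq, local_feats @ Wk, local_feats @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def predict_trajectories(local_feats, horizon=6, d_latent=4):
    # Encoder: global attention encodings concatenated with the agents'
    # local features and projected to a latent variable z; a linear
    # "motion policy" then maps z to a future (x, y) trajectory.
    # All weights here are untrained random stand-ins.
    g = global_encodings(local_feats)
    h = np.concatenate([local_feats, g], axis=1)
    W_enc = rng.standard_normal((h.shape[1], d_latent)) * 0.1
    z = np.tanh(h @ W_enc)
    W_pol = rng.standard_normal((d_latent, horizon * 2)) * 0.1
    return (z @ W_pol).reshape(-1, horizon, 2)

# Toy scene: 5 agents, each with (x, y, vx, vy); classes from a map layer.
feats = rng.standard_normal((5, 4))
classes = ["vehicle", "pedestrian", "vehicle", "static", "cyclist"]
selected = select_relevant_agents(
    feats, classes, {"vehicle", "pedestrian", "cyclist"})
trajs = predict_trajectories(selected)
print(trajs.shape)  # (4, 6, 2): 4 relevant agents, 6 future steps, (x, y)
```

In this toy run the static obstacle is filtered out by its semantic class before attention is computed, which is the abstract's point: relevance is decided by scene semantics, not distance alone.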