Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track (2405.10567v2)
Abstract: In this report, we describe the technical details of our submission to the 2024 RoboDrive Challenge Robust Map Segmentation Track. The track focuses on segmenting complex driving scene elements in BEV maps under varied driving conditions. Semantic map segmentation provides abundant and precise static environmental information that is crucial for the planning and navigation of autonomous driving systems. While current methods excel in ideal circumstances, e.g., clear daytime conditions and fully functional sensors, their resilience to real-world challenges such as adverse weather and sensor failures remains unclear, raising concerns about system safety. In this report, we explore several methods to improve the robustness of map segmentation: 1) a robustness analysis of utilizing temporal information; 2) a robustness analysis of utilizing different backbones; and 3) data augmentation to boost corruption robustness. Based on the evaluation results, we draw several important findings: 1) the temporal fusion module is effective in improving the robustness of the map segmentation model; 2) a strong backbone is effective for improving corruption robustness; and 3) some data augmentation methods are effective in improving the robustness of map segmentation models. These novel findings allowed us to achieve promising results in the 2024 RoboDrive Challenge Robust Map Segmentation Track.