Weakly Supervised LiDAR Semantic Segmentation via Scatter Image Annotation (2404.12861v2)
Abstract: Weakly supervised LiDAR semantic segmentation has made significant strides with limited labeled data. However, most existing methods focus on network training under weak supervision, while efficient annotation strategies remain largely unexplored. To close this gap, we perform LiDAR semantic segmentation using scatter image annotation, integrating an efficient annotation strategy with network training. Specifically, we propose using scatter images to annotate LiDAR point clouds, combining a pre-trained optical flow estimation network with a foundation image segmentation model to rapidly propagate manual annotations into dense labels for both images and point clouds. Moreover, we propose ScatterNet, a network that includes three pivotal strategies to reduce the performance gap caused by such annotations. First, it uses dense semantic labels as supervision for the image branch, alleviating the modality imbalance between point clouds and images. Second, an intermediate fusion branch is proposed to obtain multimodal texture and structural features. Third, a perception consistency loss is introduced to determine which information should be fused and which should be discarded during fusion. Extensive experiments on the nuScenes and SemanticKITTI datasets demonstrate that our method requires less than 0.02% of the labeled points to achieve over 95% of the performance of fully supervised methods. Notably, our labeled points amount to only 5% of those used by the most advanced weakly supervised methods.
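The abstract mentions turning image annotations into labels for point clouds. One common way to carry per-pixel labels over to LiDAR points is to project each point into the image and read the label at the resulting pixel. The sketch below illustrates only that generic projection step; it is not the paper's pipeline. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, points already expressed in the camera frame, and no occlusion handling. All names (`project_point`, `transfer_labels`, `IGNORE_LABEL`) are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

IGNORE_LABEL = -1  # hypothetical marker for points that receive no label


def project_point(
    point_cam: Sequence[float], fx: float, fy: float, cx: float, cy: float
) -> Optional[Tuple[int, int]]:
    """Project a 3D point (camera frame) to integer pixel coordinates (u, v).

    Returns None for points at or behind the image plane (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0.0:
        return None
    u = fx * x / z + cx
    v = fy * y / z + cy
    return math.floor(u), math.floor(v)


def transfer_labels(
    points_cam: Sequence[Sequence[float]],
    label_image: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Assign each point the label of the pixel it projects onto.

    Points that fall outside the image or behind the camera get IGNORE_LABEL.
    Assumption: no occlusion reasoning, so a point hidden behind another
    object still receives the label of the visible pixel.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for point in points_cam:
        pixel = project_point(point, fx, fy, cx, cy)
        if pixel is None:
            labels.append(IGNORE_LABEL)
            continue
        u, v = pixel
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])  # row index is v, column is u
        else:
            labels.append(IGNORE_LABEL)
    return labels
```

In practice the points would first be transformed from the LiDAR frame into the camera frame using the calibrated extrinsics, and points labeled `IGNORE_LABEL` would be excluded from the supervised loss.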