Towards Dynamic and Small Objects Refinement for Unsupervised Domain Adaptative Nighttime Semantic Segmentation (2310.04747v2)
Abstract: Nighttime semantic segmentation plays a crucial role in practical applications such as autonomous driving, where it frequently encounters difficulties caused by inadequate illumination and the absence of well-annotated datasets. Moreover, semantic segmentation models trained on daytime datasets often struggle to generalize to nighttime conditions. Unsupervised domain adaptation (UDA) has shown the potential to address these challenges and has achieved remarkable results for nighttime semantic segmentation. However, existing methods still face two limitations: 1) they rely on style transfer or relighting models, which struggle to generalize to complex nighttime environments, and 2) they ignore dynamic and small objects such as vehicles and poles, which are difficult to learn directly from other domains. This paper proposes a novel UDA method that refines both the label and feature levels for dynamic and small objects in nighttime semantic segmentation. First, we propose a dynamic and small object refinement module that transfers knowledge of dynamic and small objects from the source domain to the target nighttime domain; these objects are typically context-inconsistent under under-exposed conditions. Then, we design a feature prototype alignment module that reduces the domain gap by applying contrastive learning between features and prototypes of the same class from different domains, while re-weighting the categories of dynamic and small objects. Extensive experiments on three benchmark datasets demonstrate that our method outperforms prior art by a large margin for nighttime segmentation. Project page: https://rorisis.github.io/DSRNSS/.
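The feature prototype alignment idea described above — contrasting a feature against per-class prototypes from the other domain, with re-weighting for dynamic and small-object categories — can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation; the function names, the InfoNCE-style loss form, and the `class_weight` dictionary are assumptions for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (guard against the zero vector)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def class_prototypes(features, labels):
    """Per-class prototype: the L2-normalized mean of that class's features
    within one domain (e.g., computed on the source domain)."""
    sums, counts = {}, {}
    for f, c in zip(features, labels):
        acc = sums.setdefault(c, [0.0] * len(f))
        for i, x in enumerate(f):
            acc[i] += x
        counts[c] = counts.get(c, 0) + 1
    return {c: l2_normalize([x / counts[c] for x in s]) for c, s in sums.items()}

def prototype_contrastive_loss(feat, label, prototypes, class_weight, tau=0.1):
    """InfoNCE-style loss: pull `feat` toward the same-class prototype from
    the other domain and push it away from all other class prototypes.
    `class_weight` (hypothetical) up-weights dynamic/small-object classes."""
    f = l2_normalize(feat)
    sims = {c: dot(f, p) / tau for c, p in prototypes.items()}
    denom = sum(math.exp(s) for s in sims.values())
    loss = -math.log(math.exp(sims[label]) / denom)
    return class_weight.get(label, 1.0) * loss
```

For example, given source-domain prototypes for "car" and "road", a target feature aligned with the "car" prototype incurs a small loss, while a misaligned one incurs a large loss; setting `class_weight={"pole": 2.0, "car": 2.0}` would emphasize those categories during adaptation, as the abstract's re-weighting suggests.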