A Semi-Paired Approach For Label-to-Image Translation (2306.13585v2)
Abstract: Data efficiency, or the ability to generalize from a small amount of labeled data, remains a major challenge in deep learning. Semi-supervised learning has thrived in traditional recognition tasks, alleviating the need for large amounts of labeled data, yet it remains understudied in image-to-image translation (I2I). In this work, we introduce the first semi-supervised (semi-paired) framework for label-to-image translation, a challenging subtask of I2I that generates photorealistic images from semantic label maps. In the semi-paired setting, the model has access to a small set of paired data and a larger set of unpaired images and labels. Instead of using geometrical transformations as a pretext task as in previous works, we leverage an input reconstruction task by exploiting the conditional discriminator on the paired data as a reverse generator. We propose a training algorithm for this shared network, and we present a rare-class sampling algorithm to focus on under-represented classes. Experiments on three standard benchmarks show that the proposed model outperforms state-of-the-art unsupervised and semi-supervised approaches, as well as some fully supervised approaches, while using a much smaller number of paired samples.
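The abstract refers to a rare-class sampling algorithm but does not spell out how samples are chosen. The sketch below is a minimal, hypothetical illustration of one common way such a sampler can be built, weighting label maps by the inverse frequency of the rarest class they contain; the function names, the exponential weighting rule, and the temperature parameter are assumptions for illustration, not the paper's actual method.

```python
# Illustrative sketch only: shows one plausible rare-class sampling scheme,
# not the paper's implementation. Label maps containing under-represented
# classes are drawn more often when forming training batches.
import numpy as np


def class_pixel_frequencies(label_maps, num_classes):
    """Per-class pixel frequency over the whole training set of label maps."""
    counts = np.zeros(num_classes, dtype=np.float64)
    for lm in label_maps:
        ids, freq = np.unique(lm, return_counts=True)
        counts[ids] += freq
    return counts / counts.sum()


def sample_probabilities(label_maps, num_classes, temperature=0.1):
    """Assign each label map a sampling probability based on its rarest class,
    so maps that contain rare classes are over-sampled (assumed rule)."""
    freq = class_pixel_frequencies(label_maps, num_classes)
    weights = []
    for lm in label_maps:
        present = np.unique(lm)
        rarest = freq[present].min()          # frequency of the rarest class in this map
        weights.append(np.exp(-rarest / temperature))
    w = np.asarray(weights)
    return w / w.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy label maps: class 3 is deliberately rare in the dataset.
    maps = [rng.integers(0, 3, size=(8, 8)) for _ in range(19)]
    maps.append(np.full((8, 8), 3))
    p = sample_probabilities(maps, num_classes=4)
    batch_idx = rng.choice(len(maps), size=4, p=p)  # biased towards rare-class maps
    print(batch_idx)
```

In a training loop, the resulting probabilities would replace uniform shuffling when drawing the label maps fed to the generator, which is one straightforward way to keep under-represented classes visible during adversarial training.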