Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile (2403.05093v1)
Abstract: Generative models have made remarkable progress in image generation and synthesis. Despite photo-realistic results, intrinsic discrepancies are still observed in the frequency domain, and this spectral discrepancy appears not only in generative adversarial networks (GANs) but also in diffusion models. In this study, we propose a framework that mitigates the frequency-domain disparity of generated images to improve the generative performance of both GANs and diffusion models. The framework, spectrum translation for the refinement of image generation (STIG), is based on contrastive learning. We adopt a theoretical analysis of frequency components in various generative networks. The key idea is to refine the spectrum of a generated image by combining image-to-image translation and contrastive learning from a digital signal processing perspective. We evaluate the framework on eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG. It outperforms other cutting-edge methods, showing significant decreases in FID and in the log frequency distance of the spectrum. We further emphasize that STIG improves image quality by reducing spectral anomalies. Additionally, validation results show that a frequency-based deepfake detector is more easily confused when fake spectra are manipulated by STIG.
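To make the spectral quantities named in the abstract concrete, here is a minimal NumPy sketch of a centered log-magnitude spectrum and a log frequency distance between two images. It is an illustration under stated assumptions, not the STIG implementation: the function names, the `eps` parameter, the orthonormal FFT scaling, and the single-channel input are all choices made here, and the distance follows the commonly used form log(mean |F_real - F_fake|^2 + 1), which may differ in detail from the paper's evaluation code.

```python
import numpy as np


def log_magnitude_spectrum(image, eps=1e-8):
    """Centered log-magnitude spectrum of a 2-D grayscale image.

    `eps` (an assumed stabilizer) avoids log(0) for empty frequency bins.
    """
    # Orthonormal 2-D FFT, with the zero frequency shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(image, norm="ortho"))
    return np.log(np.abs(spectrum) + eps)


def log_frequency_distance(real, fake):
    """Log of (mean squared spectral difference + 1) between two images.

    Assumed form: log(mean |F_real - F_fake|^2 + 1). Identical images give 0.
    """
    f_real = np.fft.fft2(real, norm="ortho")
    f_fake = np.fft.fft2(fake, norm="ortho")
    mean_sq_diff = np.mean(np.abs(f_real - f_fake) ** 2)
    return float(np.log(mean_sq_diff + 1.0))


# Usage with random stand-ins for a real and a generated image.
rng = np.random.default_rng(0)
real_image = rng.random((64, 64))
fake_image = rng.random((64, 64))
print(log_magnitude_spectrum(fake_image).shape)
print(log_frequency_distance(real_image, fake_image))
```

A lower distance means the two spectra are closer, which is the direction of improvement the abstract reports. With the orthonormal scaling assumed here, the mean squared spectral difference equals the mean squared pixel difference (Parseval's theorem), so the measure is most informative when averaged over spectra of many images rather than computed on a single pair.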