Locality-preserving Directions for Interpreting the Latent Space of Satellite Image GANs (2309.14883v1)

Published 26 Sep 2023 in cs.CV and cs.LG

Abstract: We present a locality-aware method for interpreting the latent space of wavelet-based Generative Adversarial Networks (GANs), which captures well the large spatial and spectral variability characteristic of satellite imagery. By focusing on preserving locality, the proposed method decomposes the weight space of pre-trained GANs and recovers interpretable directions that correspond to high-level semantic concepts (such as urbanization, structure density, and flora presence), which can subsequently be used for guided synthesis of satellite imagery. In contrast to commonly used approaches that capture the variability of the weight space in a reduced-dimensionality space (i.e., based on Principal Component Analysis, PCA), we show that preserving locality yields directions with different angles that are more robust to artifacts and better preserve class information. Via a set of quantitative and qualitative examples, we further show that the proposed approach can outperform both baseline geometric augmentations and global, PCA-based approaches for data synthesis in the context of data augmentation for satellite scene classification.
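As a rough illustration of the locality-preserving decomposition the abstract describes, the sketch below computes Locality Preserving Projections (LPP; He & Niyogi, 2003) over the rows of a weight matrix and contrasts with PCA's global-variance criterion. This is a minimal sketch under assumed details: the function name lpp_directions and the neighbourhood parameters k and t are illustrative, not the paper's API, and the exact formulation applied to wavelet-based GANs may differ.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist


def lpp_directions(X, n_dirs=5, k=10, t=1.0):
    """Locality Preserving Projections over X (n_samples x n_features),
    e.g. rows of a pre-trained generator's weight/style matrix.

    Unlike PCA, which keeps directions of maximal global variance,
    LPP keeps directions along which neighbouring samples stay close.
    """
    n = X.shape[0]

    # 1. k-NN graph with heat-kernel edge weights.
    d2 = cdist(X, X, "sqeuclidean")
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    W = np.zeros((n, n))
    for i in range(n):
        W[i, knn[i]] = np.exp(-d2[i, knn[i]] / t)
    W = np.maximum(W, W.T)                     # symmetrise the adjacency

    # 2. Graph Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # 3. Generalised eigenproblem  X^T L X a = lambda X^T D X a.
    #    Eigenvectors with the *smallest* eigenvalues preserve locality.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # small ridge for stability
    eigvals, eigvecs = eigh(A, B)                # ascending eigenvalues
    return eigvecs[:, :n_dirs]                   # (n_features, n_dirs)
```

A recovered column of the returned matrix could then steer synthesis by shifting a style vector, e.g. w_edit = w + alpha * V[:, 0], where alpha (a hypothetical scalar here) controls the strength of a semantic edit such as urbanization.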
