3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis (2401.03764v1)

Published 8 Jan 2024 in cs.CV and cs.GR

Abstract: Existing 3D-aware portrait synthesis methods can generate impressive high-quality images while preserving strong 3D consistency. However, most of them cannot support fine-grained part-level control over synthesized images. Conversely, some GAN-based 2D portrait synthesis methods can achieve clear disentanglement of facial regions, but they cannot preserve view consistency due to a lack of 3D modeling ability. To address these issues, we propose 3D-SSGAN, a novel framework for 3D-aware compositional portrait image synthesis. First, a simple yet effective depth-guided 2D-to-3D lifting module maps the generated 2D part features and semantics to 3D. Then, a volume renderer, combined with a novel 3D-aware semantic mask renderer, produces the composed face features and corresponding masks. The whole framework is trained end-to-end by discriminating between real and synthesized 2D images and their semantic masks. Quantitative and qualitative evaluations demonstrate the superiority of 3D-SSGAN in controllable part-level synthesis while preserving 3D view consistency.
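
The abstract only outlines the pipeline at a high level. As a rough illustration, the minimal PyTorch sketch below shows one way a depth-guided 2D-to-3D lifting step and semantic composition during volume rendering could be wired together. The module names, tensor shapes, the Gaussian depth-spreading rule, and the density proxy are all assumptions made for illustration, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedLifting(nn.Module):
    """Hypothetical lifting module: place each pixel's 2D part feature into a
    3D feature volume at its predicted depth, softly spread across depth bins
    with a Gaussian (an assumed rule; the abstract does not specify the exact
    lifting operation)."""
    def __init__(self, num_depth_bins=32, sigma=1.5):
        super().__init__()
        self.num_depth_bins = num_depth_bins
        self.sigma = sigma

    def forward(self, feat_2d, depth):
        # feat_2d: (B, C, H, W) part features from a 2D generator branch
        # depth:   (B, 1, H, W) predicted depth for that part, in [0, 1]
        D = self.num_depth_bins
        bins = torch.linspace(0.0, 1.0, D, device=feat_2d.device).view(1, D, 1, 1)
        # Soft assignment of each pixel to depth bins: (B, D, H, W)
        w = torch.exp(-((depth - bins) ** 2) / (2.0 * (self.sigma / D) ** 2))
        w = w / (w.sum(dim=1, keepdim=True) + 1e-8)
        # Outer product of features and depth weights: (B, C, D, H, W)
        return feat_2d.unsqueeze(2) * w.unsqueeze(1)

def compose_and_render(part_volumes, part_densities):
    """Blend per-part feature volumes with softmax-normalised densities (a
    stand-in for the paper's 3D-aware semantic mask renderer), then
    alpha-composite front to back along the depth axis."""
    dens = torch.cat(part_densities, dim=1)                   # (B, P, D, H, W)
    sem = F.softmax(dens, dim=1)                               # per-sample semantic weights
    feats = torch.stack(part_volumes, dim=1)                   # (B, P, C, D, H, W)
    fused = (feats * sem.unsqueeze(2)).sum(dim=1)              # (B, C, D, H, W)
    alpha = 1.0 - torch.exp(-dens.sum(dim=1, keepdim=True))    # (B, 1, D, H, W)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :, :1]),
                   1.0 - alpha[:, :, :-1]], dim=2), dim=2)     # accumulated transmittance
    weights = alpha * trans                                     # compositing weights
    feat_image = (fused * weights).sum(dim=2)                   # (B, C, H, W)
    sem_mask = (sem * weights).sum(dim=2)                       # (B, P, H, W)
    return feat_image, sem_mask

# Toy usage with two parts and random tensors (shape check only, no training)
lift = DepthGuidedLifting()
vols, dens = [], []
for _ in range(2):
    feat = torch.randn(1, 8, 16, 16)
    depth = torch.rand(1, 1, 16, 16)
    vol = lift(feat, depth)
    vols.append(vol)
    dens.append(vol.abs().mean(dim=1, keepdim=True))  # crude density proxy
feat_image, sem_mask = compose_and_render(vols, dens)
print(feat_image.shape, sem_mask.shape)  # torch.Size([1, 8, 16, 16]) torch.Size([1, 2, 16, 16])

In the full framework described by the abstract, the rendered feature image and per-part semantic masks would then be decoded to an RGB portrait and fed, together with real images and masks, to discriminators for end-to-end adversarial training.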
