What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs (2401.02411v1)
Abstract: 3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries of scenes from collections of 2D images via neural volume rendering. Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with post-processing 2D super resolution, which sacrifices multi-view consistency and the quality of resolved geometry. Consequently, 3D GANs have not yet been able to fully resolve the rich 3D geometry present in 2D images. In this work, we propose techniques to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail. Our approach employs learning-based samplers for accelerating neural rendering for 3D GAN training using up to 5 times fewer depth samples. This enables us to explicitly "render every pixel" of the full-resolution image during training and inference without post-processing super resolution in 2D. Together with our strategy to learn high-quality surface geometry, our method synthesizes high-resolution 3D geometry and strictly view-consistent images while maintaining image quality on par with baselines relying on post-processing super resolution. We demonstrate state-of-the-art 3D geometric quality on FFHQ and AFHQ, setting a new standard for unsupervised learning of 3D shapes in 3D GANs.
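To make the cost argument in the abstract concrete, the following is a minimal sketch of the standard volume rendering quadrature for a single ray (in the style of Max 1995 and NeRF), not the learned sampler proposed in this paper. The function name `render_ray` and the toy inputs are assumptions made for illustration. The work per pixel grows linearly with the number of depth samples, which is why using fewer samples per ray matters when every pixel of a high-resolution image is rendered.

```python
import numpy as np

def render_ray(sigmas, colors, depths):
    """Composite N depth samples along one ray (hypothetical helper).

    sigmas: (N,) non-negative densities at the samples
    colors: (N, 3) RGB values at the samples
    depths: (N,) increasing sample depths along the ray, N >= 2
    """
    sigmas = np.asarray(sigmas, dtype=np.float64)
    colors = np.asarray(colors, dtype=np.float64)
    depths = np.asarray(depths, dtype=np.float64)

    # Interval length per sample; the last sample reuses the previous spacing
    # (an assumption of this sketch).
    deltas = np.diff(depths)
    deltas = np.append(deltas, deltas[-1])

    # Opacity of each interval.
    alphas = 1.0 - np.exp(-sigmas * deltas)

    # Transmittance T_i = prod_{j < i} (1 - alpha_j): light surviving up to sample i.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))

    # Per-sample contribution and the composited pixel color.
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights

# Toy ray: empty space at the first sample, then an opaque green surface.
rgb, weights = render_ray(
    sigmas=[0.0, 1e6, 1e6],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    depths=[1.0, 2.0, 3.0],
)
print(rgb, weights)
```

In this toy case almost all of the weight falls on the first opaque sample, so samples placed far from the surface contribute little. This is the general intuition behind concentrating samples near surfaces; how the paper's learned samplers do so is not described in the abstract.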
Authors:
- Alex Trevithick
- Matthew Chan
- Towaki Takikawa
- Umar Iqbal
- Shalini De Mello
- Manmohan Chandraker
- Ravi Ramamoorthi
- Koki Nagano