
GaussianPro: 3D Gaussian Splatting with Progressive Propagation (2402.14650v1)

Published 22 Feb 2024 in cs.CV

Abstract: The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed. However, 3DGS heavily depends on the initialized point cloud produced by Structure-from-Motion (SfM) techniques. When tackling large-scale scenes that unavoidably contain texture-less surfaces, SfM techniques often fail to produce enough points on these surfaces and cannot provide good initialization for 3DGS. As a result, 3DGS suffers from difficult optimization and low-quality renderings. In this paper, inspired by classical multi-view stereo (MVS) techniques, we propose GaussianPro, a novel method that applies a progressive propagation strategy to guide the densification of the 3D Gaussians. Compared to the simple split and clone strategies used in 3DGS, our method leverages the priors of the scene's existing reconstructed geometry together with patch matching to produce new Gaussians with accurate positions and orientations. Experiments on both large-scale and small-scale scenes validate the effectiveness of our method: it significantly surpasses 3DGS on the Waymo dataset, exhibiting an improvement of 1.15 dB in PSNR.

Authors (8)
  1. Kai Cheng (38 papers)
  2. Xiaoxiao Long (47 papers)
  3. Kaizhi Yang (5 papers)
  4. Yao Yao (235 papers)
  5. Wei Yin (57 papers)
  6. Yuexin Ma (97 papers)
  7. Wenping Wang (184 papers)
  8. Xuejin Chen (29 papers)
Citations (66)

Summary

  • The paper introduces a progressive propagation strategy that densifies 3D Gaussians using 2D view-dependent depth and normal maps.
  • It significantly improves rendering quality, achieving a 1.15dB PSNR gain on the Waymo dataset and robust performance on MipNeRF360.
  • The incorporation of a planar constraint refines geometric accuracy by ensuring consistency between rendered and propagated normals.

Enhancing 3D Gaussian Splatting with Progressive Propagation for Neural Rendering

Introduction to GaussianPro

In the quest for real-time neural rendering, the advent of 3D Gaussian Splatting (3DGS) represented a significant leap forward, thanks to its efficiency and the quality of its renderings. However, 3DGS depends on sparse Structure-from-Motion (SfM) point clouds for initialization, and this reliance introduces notable limitations: in large-scale scenes featuring texture-less surfaces, SfM techniques struggle to generate sufficient points for effective 3DGS initialization. Addressing these challenges, our work introduces GaussianPro, a novel approach that enhances the densification of 3D Gaussians through a progressive propagation strategy. Our method bridges the gap between classical multi-view stereo (MVS) techniques and modern neural rendering, significantly improving rendering quality.

Methodology

Hybrid Geometric Representation

We approached the challenge by combining 3D Gaussians with 2D view-dependent depth and normal maps. This hybrid representation lets us work in 2D image space, where neighboring Gaussians are cheap to identify and geometric information can be propagated efficiently among them. Our method projects the 3D Gaussians into 2D to render per-view depth and normal maps, and these maps then guide an informed densification that steers the growth of new Gaussians.
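As a rough illustration, the per-pixel values of these rendered depth and normal maps can be obtained with the same front-to-back alpha blending that 3DGS uses for color. The NumPy sketch below handles a single pixel and assumes the Gaussians overlapping it are already sorted near-to-far; the function name and inputs are illustrative, not the paper's actual API.

```python
import numpy as np

def render_depth_normal(depths, normals, alphas):
    """Alpha-blend per-pixel Gaussian depths and normals, front to back.

    depths:  (K,)   depth of each Gaussian overlapping the pixel, sorted near-to-far
    normals: (K, 3) unit normals of those Gaussians
    alphas:  (K,)   per-Gaussian opacity after the 2D Gaussian falloff
    """
    # Transmittance T_i = prod_{j<i} (1 - alpha_j); blending weight w_i = alpha_i * T_i
    T = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    w = alphas * T
    w_sum = max(w.sum(), 1e-8)
    depth = float((w * depths).sum() / w_sum)
    normal = (w[:, None] * normals).sum(axis=0)
    normal /= max(np.linalg.norm(normal), 1e-8)
    return depth, normal
```

Note that a fully opaque front Gaussian (alpha = 1) determines the pixel on its own, since the transmittance of everything behind it drops to zero.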

Progressive Gaussian Propagation

Central to our approach is the progressive Gaussian propagation strategy. This technique utilizes patch matching to propagate depth and normal information from neighboring pixels, generating new, more accurate Gaussians. This method not only addresses the challenges posed by texture-less regions but also compensates for the limitations of sparse SfM point clouds. Furthermore, by employing geometric filtering and selection, we refine the propagation results, ensuring that new Gaussians are generated where necessary to model the scene accurately.
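In spirit, this step resembles classical PatchMatch propagation: each pixel tests whether a neighbor's depth/normal hypothesis explains its own appearance better under some photometric cost, such as NCC over a local patch. The sketch below is a hypothetical simplification that abstracts the matching cost into a user-supplied `cost_fn` and performs a single propagation sweep; the actual method also re-projects candidates across views and applies geometric filtering before spawning new Gaussians.

```python
import numpy as np

def propagate(depth, normal, cost_fn):
    """One PatchMatch-style sweep: adopt a 4-neighbor's (depth, normal)
    hypothesis whenever it has lower matching cost than the current one.

    depth:   (H, W)    per-pixel depth map
    normal:  (H, W, 3) per-pixel normal map
    cost_fn: (y, x, d, n) -> float, lower is a better match
    """
    H, W = depth.shape
    new_d, new_n = depth.copy(), normal.copy()
    for y in range(H):
        for x in range(W):
            best = cost_fn(y, x, depth[y, x], normal[y, x])
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    c = cost_fn(y, x, depth[ny, nx], normal[ny, nx])
                    if c < best:
                        best = c
                        new_d[y, x] = depth[ny, nx]
                        new_n[y, x] = normal[ny, nx]
    return new_d, new_n
```

Because good hypotheses spread one pixel per sweep, a few alternating sweeps are typically enough to carry reliable geometry across a texture-less region.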

Experimental Evaluations

Our experiments, conducted on both large-scale and small-scale scenes, demonstrate the effectiveness of our method. On the Waymo dataset, GaussianPro surpassed the rendering quality of 3DGS, achieving a significant improvement of 1.15dB in PSNR. Similarly, on the MipNeRF360 dataset, our method showcased its robustness and adaptability, delivering comparable or superior performance to 3DGS across various metrics.

Integration of Planar Constraint

An innovative aspect of our methodology is the introduction of a planar constraint during the optimization process. This constraint enforces consistency between each Gaussian's rendered normal and the propagated normal, thereby improving the geometric accuracy of the Gaussians. The results underscore the capacity of our method to generate more compact and accurate Gaussians, leading to enhanced rendering quality, particularly in scenes with prevalent planar surfaces.
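A hedged sketch of such a constraint: penalize both the L1 difference and the angular deviation between the rendered and propagated normal maps, restricted by a mask to pixels where propagation produced a valid normal. The equal weighting of the two terms here is illustrative, not the paper's exact formulation.

```python
import numpy as np

def planar_normal_loss(rendered_n, prop_n, mask):
    """Consistency loss between rendered and propagated normal maps.

    rendered_n: (H, W, 3) unit normals rendered from the Gaussians
    prop_n:     (H, W, 3) unit normals produced by propagation
    mask:       (H, W)    1 where the propagated normal is valid, else 0
    """
    diff = np.abs(rendered_n - prop_n).sum(axis=-1)      # L1 term
    ang = 1.0 - (rendered_n * prop_n).sum(axis=-1)       # 1 - cos(angle)
    return float(((diff + ang) * mask).sum() / max(mask.sum(), 1e-8))
```

The loss vanishes when the two normal maps agree everywhere the mask is set, and grows with both the magnitude and the angle of the disagreement.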

Conclusion and Future Work

GaussianPro represents a substantial step forward in neural rendering, addressing the critical issues of 3D Gaussian densification and optimization in texture-less regions. By integrating insights from classical MVS and exploiting the strength of modern neural rendering techniques, our method not only enhances the visual quality of renderings but also maintains computational efficiency. While our current focus has been on static scenes, future developments could extend this approach to dynamic objects, offering a comprehensive solution for real-time, high-quality neural rendering across a broader spectrum of applications.
