
3D Reconstruction with Fast Dipole Sums (2405.16788v4)

Published 27 May 2024 in cs.CV and cs.GR

Abstract: We introduce a method for high-quality 3D reconstruction from multi-view images. Our method uses a new point-based representation, the regularized dipole sum, which generalizes the winding number to allow for interpolation of per-point attributes in point clouds with noisy or outlier points. Using regularized dipole sums, we represent implicit geometry and radiance fields as per-point attributes of a dense point cloud, which we initialize from structure from motion. We additionally derive Barnes-Hut fast summation schemes for accelerated forward and adjoint dipole sum queries. These queries facilitate the use of ray tracing to efficiently and differentiably render images with our point-based representations, and thus update their point attributes to optimize scene geometry and appearance. We evaluate our method in inverse rendering applications against state-of-the-art alternatives, based on ray tracing of neural representations or rasterization of Gaussian point-based representations. Our method significantly improves 3D reconstruction quality and robustness at equal runtimes, while also supporting more general rendering methods such as shadow rays for direct illumination.


Summary

  • The paper presents the regularized dipole sum, a point-based representation that generalizes the winding number to interpolate per-point attributes in noisy point clouds, giving robust geometry reconstruction.
  • It derives Barnes-Hut fast summation schemes that accelerate forward and adjoint dipole sum queries, enabling efficient, differentiable inverse rendering via ray tracing.
  • Evaluation on datasets such as DTU and BlendedMVS shows higher reconstruction quality than state-of-the-art techniques at equal runtimes.

3D Reconstruction with Fast Dipole Sums

The paper "3D Reconstruction with Fast Dipole Sums" introduces a technique for reconstructing high-quality 3D surfaces from multi-view images. Authored by Hanyu Chen, Bailey Miller, and Ioannis Gkioulekas of Carnegie Mellon University, it builds on a new point-based representation, the regularized dipole sum. This representation generalizes the winding number, so that arbitrary per-point attributes can be interpolated even when the point cloud contains noisy or outlier points.

Overview

The technique targets inverse rendering applications in which implicit scene geometry and radiance fields are represented as per-point attributes of a dense point cloud. The point cloud is initialized from structure from motion (SfM). To make the representation practical, the authors derive Barnes-Hut fast summation schemes that accelerate both forward and adjoint dipole sum queries. These queries make it possible to render images efficiently and differentiably with ray tracing, and thus to update the point attributes to optimize scene geometry and appearance. The result is a significant improvement in reconstruction quality and robustness at equal runtimes compared with state-of-the-art alternatives.

Contributions and Results

Geometry and Radiance Field Representation

  1. Geometry Field: The dipole sum representation generalizes the winding number by introducing regularized kernels and general per-point attributes, enabling it to handle noisy or outlier-laden point clouds from SfM. The geometry field $\sigma$ is represented as a dipole sum, written here with the singular (unregularized) kernel:

$$\sigma(\mathbf{x}) = \sum_{m} \frac{\alpha_m}{4\pi} \frac{\mathbf{n}_m \cdot (\mathbf{x} - \mathbf{x}_m)}{|\mathbf{x} - \mathbf{x}_m|^3},$$

where $\alpha_m$ are learned scalar weights, $\mathbf{n}_m$ are normals, and $\mathbf{x}_m$ are point positions in the point cloud.

  2. Radiance Field: The radiance field representation interpolates appearance attributes through the same dipole sum mechanism and feeds the interpolated attributes into a shallow MLP to predict colors. Fast summation keeps both the forward computation and backpropagation efficient.
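
To make the geometry-field sum in item 1 concrete, the following is a minimal brute-force sketch in NumPy, not the authors' implementation. It evaluates the formula directly in O(QM) time for Q query points and M cloud points, and also shows the matching adjoint query (the gradient with respect to the weights), which is what backpropagation needs. The function names and the softening parameter `eps` are assumptions introduced for illustration; in particular, `eps` is a generic softening of the singular kernel and should not be read as the paper's regularization.

```python
import numpy as np

def kernel_matrix(queries, points, normals, eps=0.0):
    """K[q, m] = n_m . (x_q - x_m) / (4 pi (|x_q - x_m|^2 + eps^2)^(3/2)).

    eps is a hypothetical softening length (an assumption, not the paper's
    regularization); eps = 0 gives the singular kernel of the formula above.
    """
    r = queries[:, None, :] - points[None, :, :]    # (Q, M, 3) offsets x_q - x_m
    dist2 = np.sum(r * r, axis=-1) + eps ** 2       # (Q, M) softened squared distances
    dots = np.einsum("qmk,mk->qm", r, normals)      # (Q, M) values of n_m . (x_q - x_m)
    return dots / (4.0 * np.pi * dist2 ** 1.5)

def dipole_sum(queries, points, normals, alphas, eps=0.0):
    """Forward query: sigma(x_q) = sum_m alpha_m K[q, m]."""
    return kernel_matrix(queries, points, normals, eps) @ alphas

def dipole_sum_adjoint(upstream, queries, points, normals, eps=0.0):
    """Adjoint query: dL/dalpha_m = sum_q (dL/dsigma(x_q)) K[q, m]."""
    return kernel_matrix(queries, points, normals, eps).T @ upstream
```

As a sanity check on the sign convention of the formula as written: for M points on the unit sphere with outward normals and weights alpha_m = 4π/M, the sum evaluated at the sphere's center is exactly -1, because every term contributes -1/M. The forward and adjoint functions share one kernel matrix, which is why the same fast summation machinery can serve both directions.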

Performance and Evaluation

The authors validate the efficacy of their approach through extensive empirical evaluation against several state-of-the-art techniques, including ray tracing of neural representations and rasterization of Gaussian point-based representations. Key results include:

  • Efficiency: The technique achieves notable computational efficiency due to the Barnes-Hut fast summation scheme, enabling inverse rendering at speeds competitive with rasterization.
  • Quality: The proposed method surpasses others in reconstructing detailed and high-quality surfaces, particularly when evaluated on datasets such as DTU and BlendedMVS.
  • Compatibility: The technique maintains compatibility with advanced rendering techniques, including shadow rays, essential for rendering direct illumination.
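
The Barnes-Hut idea behind the efficiency claim can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's scheme: it uses a binary median-split tree instead of an octree, handles only the singular kernel, and approximates a distant cluster by a single aggregated dipole at its centroid (a first-order far-field approximation). The names `build_tree`, `query`, and the acceptance threshold `theta` are hypothetical; `moments[m]` stands for the product alpha_m * n_m.

```python
import numpy as np

def dipole_kernel(x, centers, moments):
    """Sum over rows of moment . (x - c) / (4 pi |x - c|^3)."""
    r = x[None, :] - centers
    d = np.linalg.norm(r, axis=1)
    return float(np.sum(np.einsum("mk,mk->m", r, moments) / (4.0 * np.pi * d ** 3)))

def build_tree(points, moments, idx=None, leaf_size=8):
    """Binary median-split tree; each node stores centroid, radius, total moment."""
    if idx is None:
        idx = np.arange(len(points))
    pts = points[idx]
    center = pts.mean(axis=0)
    node = {
        "idx": idx,
        "center": center,
        "radius": float(np.max(np.linalg.norm(pts - center, axis=1))),
        "moment": moments[idx].sum(axis=0),   # aggregated dipole of the cluster
        "children": [],
    }
    if len(idx) > leaf_size:
        axis = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))  # widest axis
        order = idx[np.argsort(pts[:, axis])]
        half = len(order) // 2
        node["children"] = [
            build_tree(points, moments, order[:half], leaf_size),
            build_tree(points, moments, order[half:], leaf_size),
        ]
    return node

def query(node, x, points, moments, theta=0.3):
    """Approximate the dipole sum at x; theta = 0 reproduces the direct sum."""
    dist = np.linalg.norm(x - node["center"])
    if dist > 0 and node["radius"] / dist < theta:
        # Far cluster: replace all its dipoles by one dipole at the centroid.
        return dipole_kernel(x, node["center"][None, :], node["moment"][None, :])
    if not node["children"]:
        i = node["idx"]
        return dipole_kernel(x, points[i], moments[i])   # near leaf: exact sum
    return sum(query(c, x, points, moments, theta) for c in node["children"])
```

A larger `theta` accepts clusters earlier and trades accuracy for speed, while `theta=0` always descends to the leaves and matches the direct sum up to floating-point rounding. The sketch does not guard against a query point that coincides with a cloud point, where the singular kernel divides by zero.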

Implications and Future Directions

The work has both practical and theoretical implications:

  • Practical Applications: By significantly improving the reconstruction quality and efficiency of 3D surfaces from multi-view images, this technique can greatly benefit various fields such as virtual reality, augmented reality, and film production.
  • Theoretical Advances: This research contributes to the understanding of point-based representations and interpolation schemes in computer graphics and vision. It bridges the gap between traditional geometry-based techniques and modern neural rendering methods.
  • Future Research: Future work could explore further enhancements in query efficiency, perhaps by leveraging packet queries for multiple points. Additionally, extending the method to handle dynamic scenes or integrating it with real-time applications could provide substantial advancements.

In conclusion, fast dipole sums represent a significant advance in point-based modeling and inverse rendering. By combining a representation that is robust to noisy and outlier points with fast summation for efficient queries, this research opens new avenues for high-quality, scalable 3D reconstruction from multi-view images.