TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering (2401.06003v2)
Abstract: Point-based radiance field rendering has demonstrated impressive results for novel view synthesis, offering a compelling blend of rendering quality and computational efficiency. However, even the latest approaches in this domain are not without shortcomings. 3D Gaussian Splatting [Kerbl and Kopanas et al. 2023] struggles with highly detailed scenes, producing blurring and cloudy artifacts. ADOP [Rückert et al. 2022], on the other hand, produces crisper images, but its neural reconstruction network reduces performance, it suffers from temporal instability, and it cannot effectively fill large gaps in the point cloud. In this paper, we present TRIPS (Trilinear Point Splatting), an approach that combines ideas from both Gaussian Splatting and ADOP. The fundamental concept behind our novel technique is to rasterize points into a screen-space image pyramid, with the pyramid layer selected according to the projected point size. This allows arbitrarily large points to be rendered with a single trilinear write. A lightweight neural network then reconstructs a hole-free image, including detail beyond the splat resolution. Importantly, our render pipeline is entirely differentiable, allowing automatic optimization of both point sizes and positions. Our evaluation demonstrates that TRIPS surpasses existing state-of-the-art methods in rendering quality while maintaining a real-time frame rate of 60 frames per second on readily available hardware. This performance extends to challenging scenarios, such as scenes featuring intricate geometry, expansive landscapes, and auto-exposed footage. The project page is located at: https://lfranke.github.io/trips/
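The core idea stated in the abstract — choose a pyramid layer from the projected point size and splat with a single trilinear write — can be sketched as follows. This is an illustrative reconstruction under assumptions, not the paper's exact formulation: the function name `trilinear_splat`, the `log2`-based level selection, and the weighting details are this sketch's own choices.

```python
import numpy as np

def trilinear_splat(pyramid, x, y, size, value):
    """Splat one point into a screen-space image pyramid with a single
    trilinear write. `pyramid` is a list of 2D float arrays, where layer
    l has half the resolution of layer l-1. Illustrative sketch only."""
    # A point covering `size` pixels lands around fractional level log2(size);
    # clamp so both blended layers exist.
    t = np.clip(np.log2(max(size, 1.0)), 0.0, len(pyramid) - 1 - 1e-6)
    l0 = int(np.floor(t))   # coarser of the two target layers
    wl = t - l0             # blend weight toward layer l0 + 1
    for l, w_layer in ((l0, 1.0 - wl), (l0 + 1, wl)):
        if w_layer == 0.0 or l >= len(pyramid):
            continue
        img = pyramid[l]
        scale = 0.5 ** l    # full-resolution screen coords -> layer coords
        px, py = x * scale, y * scale
        ix, iy = int(np.floor(px)), int(np.floor(py))
        fx, fy = px - ix, py - iy
        # Bilinear write into the four surrounding texels of this layer;
        # together with the layer blend this is one trilinear write.
        for dy, wy in ((0, 1.0 - fy), (1, fy)):
            for dx, wx in ((0, 1.0 - fx), (1, fx)):
                u, v = ix + dx, iy + dy
                if 0 <= v < img.shape[0] and 0 <= u < img.shape[1]:
                    img[v, u] += w_layer * wy * wx * value
```

Because each point touches at most eight texels regardless of its projected size, the write cost per point is constant; in the paper, a lightweight network then fuses the pyramid layers into a hole-free image.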
- Point-based computer graphics. In ACM SIGGRAPH 2004 Course Notes. 2004, pp. 7–es.
- Neural point-based graphics. In European Conference on Computer Vision (2020), Springer, pp. 696–712.
- Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (October 2021), pp. 5855–5864.
- Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022), pp. 5470–5479.
- Zip-nerf: Anti-aliased grid-based neural radiance fields. arXiv preprint arXiv:2304.06706 (2023).
- Stereo radiance fields (srf): Learning view synthesis for sparse views of novel scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 7911–7920.
- Depth synthesis and local warps for plausible image-based navigation. ACM Transactions on Graphics (TOG) 32, 3 (2013), 1–12.
- Tensorf: Tensorial radiance fields. In Computer Vision – ECCV 2022 (Cham, 2022), Avidan S., Brostow G., Cissé M., Farinella G. M., Hassner T., (Eds.), Springer Nature Switzerland, pp. 333–350.
- Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo. In Proceedings of the IEEE/CVF International Conference on Computer Vision (2021), pp. 14124–14133.
- Bundlefusion: Real-time globally consistent 3d reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics (ToG) 36, 4 (2017), 1.
- Efficient view-dependent ibr with projective texture-mapping. In EG Rendering Workshop (1998), vol. 4.
- Multi-layer depth of field rendering with tiled splatting. Proceedings of the ACM on Computer Graphics and Interactive Techniques 1, 1 (2018), 1–17.
- Plenoxels: Radiance fields without neural networks. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 5491–5500. doi:10.1109/CVPR52688.2022.00542.
- Deepstereo: Learning to predict new views from the world’s imagery. In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), pp. 5515–5524.
- Grossman J. P., Dally W. J.: Point sample rendering. In Eurographics Workshop on Rendering Techniques (1998), Springer, pp. 181–192.
- Deepwarp: Photorealistic image resynthesis for gaze manipulation. In European conference on computer vision (2016), Springer, pp. 311–326.
- Multi-view stereo for community photo collections. In 2007 IEEE 11th International Conference on Computer Vision (2007), IEEE, pp. 1–8.
- Deep blending for free-viewpoint image-based rendering. ACM Transactions on Graphics (TOG) 37, 6 (2018), 1–15.
- Baking neural radiance fields for real-time view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision (2021), pp. 5875–5884.
- Perceptual losses for real-time style transfer and super-resolution. CoRR abs/1603.08155 (2016). URL: http://arxiv.org/abs/1603.08155.
- Kobbelt L., Botsch M.: A survey of point-based techniques in computer graphics. Computers & Graphics 28, 6 (2004), 801–814.
- Kopanas G., Drettakis G.: Improving NeRF Quality by Progressive Camera Placement for Free-Viewpoint Navigation. In Vision, Modeling, and Visualization (2023), Guthe M., Grosch T., (Eds.), The Eurographics Association. doi:10.2312/vmv.20231222.
- 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics 42, 4 (2023).
- Real-time 3D reconstruction in dynamic scenes using point-based fusion. In Proc. of Joint 3DIM/3DPVT Conference (3DV) (June 2013), pp. 1–8.
- Neural point catacaustics for novel-view synthesis of reflections. ACM Transactions on Graphics (TOG) 41, 6 (2022), 1–15.
- Point-based neural rendering with per-view optimization. In Computer Graphics Forum (2021), vol. 40, Wiley Online Library, pp. 29–43.
- Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics 36, 4 (2017).
- Kitti-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 3 (2022), 3292–3310.
- Lassner C., Zollhöfer M.: Pulsar: Efficient sphere-based neural rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2021), pp. 1440–1449.
- Nerf in the wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 7210–7219.
- Instant neural graphics primitives with a multiresolution hash encoding. arXiv preprint arXiv:2201.05989 (2022).
- Neural rerendering in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 6878–6887.
- Efficient point-based rendering using image reconstruction. In PBG@ Eurographics (2007), pp. 101–108.
- Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1–14.
- Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65, 1 (2021), 99–106.
- Müller T.: tiny-cuda-nn, April 2021. URL: https://github.com/NVlabs/tiny-cuda-nn.
- Donerf: Towards real-time rendering of compact neural radiance fields using depth oracle networks. In Computer Graphics Forum (2021), vol. 40, Wiley Online Library, pp. 45–59.
- Real-time rendering of massive unstructured raw point clouds using screen-space operators. In Proceedings of the 12th International conference on Virtual Reality, Archaeology and Cultural Heritage (2011), pp. 105–112.
- Penner E., Zhang L.: Soft 3d reconstruction for view synthesis. ACM Transactions on Graphics (TOG) 36, 6 (2017), 1–11.
- Surfels: Surface elements as rendering primitives. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques (2000), pp. 335–342.
- NPBG++: accelerating neural point-based graphics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (June 2022), pp. 15969–15979.
- Adop: Approximate differentiable one-pixel point rendering. ACM Transactions on Graphics (TOG) 41, 4 (2022), 1–14.
- Riegler G., Koltun V.: Free view synthesis. In European Conference on Computer Vision (2020), Springer, pp. 623–640.
- Riegler G., Koltun V.: Stable view synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 12216–12225.
- Neat: Neural adaptive tomography. ACM Trans. Graph. 41, 4 (July 2022). doi:10.1145/3528223.3530121.
- Deep novel view synthesis from colored 3d point clouds. In European Conference on Computer Vision (2020), Springer, pp. 1–17.
- Schonberger J. L., Frahm J.-M.: Structure-from-motion revisited. In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), pp. 4104–4113.
- Shum H., Kang S. B.: Review of image-based rendering techniques. In Visual Communications and Image Processing 2000 (2000), vol. 4067, SPIE, pp. 2–13.
- Real-time continuous level of detail rendering of point clouds. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (2019), IEEE, pp. 103–110.
- Rendering point clouds with compute shaders and vertex order optimization. In Computer Graphics Forum (2021), vol. 40, Wiley Online Library, pp. 115–126.
- Software rasterization of 2 billion points in real time. arXiv preprint arXiv:2204.01287 (2022).
- Implicit neural representations with periodic activation functions. Advances in Neural Information Processing Systems 33 (2020).
- Photo tourism: Exploring photo collections in 3D. In ACM SIGGRAPH 2006 Papers (2006), pp. 835–846.
- Pushing the boundaries of view extrapolation with multiplane images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 175–184.
- Deepvoxels: Learning persistent 3d feature embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 2437–2446.
- Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV) (2016).
- Learned initializations for optimizing coordinate-based neural representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 2846–2855.
- Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022), pp. 12922–12931.
- Tucker R., Snavely N.: Single-view view synthesis with multiplane images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020), pp. 551–560.
- Advances in neural rendering. In Computer Graphics Forum (2022), vol. 41, Wiley Online Library, pp. 703–735.
- Deferred neural rendering: Image synthesis using neural textures. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1–12.
- A survey of multifragment rendering. In Computer Graphics Forum (2020), vol. 39, Wiley Online Library, pp. 623–642.
- Synsin: End-to-end view synthesis from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020), pp. 7467–7477.
- Elasticfusion: Real-time dense slam and light source estimation. The International Journal of Robotics Research 35, 14 (2016), 1697–1716.
- Surfelgan: Synthesizing realistic sensor data for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020).
- Plenoctrees for real-time rendering of neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (2021), pp. 5752–5761.
- Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF international conference on computer vision (2019), pp. 4471–4480.
- Differentiable surface splatting for point-based geometry processing. ACM Transactions on Graphics (TOG) 38, 6 (2019), 1–14.
- pixelnerf: Neural radiance fields from one or few images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 4578–4587.
- Differentiable point-based radiance fields for efficient view synthesis. arXiv preprint arXiv:2205.14330 (2022).
- The unreasonable effectiveness of deep features as a perceptual metric. In CVPR (2018).
- Surface splatting. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques (2001), pp. 371–378.
- Stereo magnification: Learning view synthesis using multiplane images. ACM Transactions on Graphics (TOG) 37, 4 (2018), 1–12.
- View synthesis by appearance flow. In European conference on computer vision (2016), Springer, pp. 286–301.