Abstract

Implicit neural representation methods have shown impressive advancements in learning 3D scenes from unstructured in-the-wild photo collections but are still limited by the large computational cost of volumetric rendering. More recently, 3D Gaussian Splatting emerged as a much faster alternative with superior rendering quality and training efficiency, especially for small-scale and object-centric scenarios. Nevertheless, this technique suffers from poor performance on unstructured in-the-wild data. To tackle this, we extend 3D Gaussian Splatting to handle unstructured image collections. We achieve this by modeling appearance to capture photometric variations in the rendered images. Additionally, we introduce a new mechanism to train transient Gaussians to handle the presence of scene occluders in an unsupervised manner. Experiments on diverse photo collection scenes and multi-pass acquisitions of outdoor landmarks show the effectiveness of our method over prior works, achieving state-of-the-art results with improved efficiency.

SWAG renders scenes from any viewpoint using appearance embeddings from any training image.

Overview

  • SWAG extends 3D Gaussian Splatting (3DGS) to efficiently handle appearance variations and transient objects in unstructured photo collections, advancing novel view synthesis (NVS) and 3D scene reconstruction.

  • Utilizes an MLP to model local appearance variations and learns image-dependent opacity variations for transient Gaussians, achieving a clear disentanglement between static and transient scene elements.

  • Significantly improves quality metrics (PSNR, SSIM, LPIPS) across scenes in the Phototourism dataset and NeRF-OSR, demonstrating superior rendering quality and efficiency over existing methods.

  • Opens avenues for future research in dynamic scene representations and presents a compelling case for applications in virtual tourism and interactive 3D modeling due to its real-time rendering capability.

SWAG: Enhancing 3D Gaussian Splatting for Unconstrained Photo Collections with Appearance Variability Modeling

Introduction

Novel view synthesis (NVS) and 3D scene reconstruction from unconstrained photo collections remain key challenges in computer vision and graphics. Despite advancements with methods like Neural Radiance Fields (NeRF) and its variants designed for in-the-wild scenarios, limitations persist, especially regarding computational efficiency and the handling of transient objects. Recently, 3D Gaussian Splatting (3DGS) has emerged as a promising alternative owing to its explicit representation and GPU-based rasterization, offering faster training and rendering. However, its performance on unstructured in-the-wild data has been suboptimal. In this context, the paper presents SWAG, a novel approach that extends 3DGS to effectively handle the appearance variations and transient objects typical of unconstrained photo collections, thereby advancing the state of the art in NVS under such challenging conditions.

Related Work

The exploration of neural rendering within unconstrained environments has been the focus of several studies. Methods like NeRF-W and Ha-NeRF have made strides in adapting NeRF to handle varying appearances and transient occluders using combinations of embeddings and visibility maps. Yet, the computational overhead of these methods makes real-time rendering elusive. On the other hand, point-based rendering techniques, including 3DGS, have demonstrated real-time rendering capabilities but grapple with aliasing issues and scene appearance variation challenges. These methodologies set the stage for the introduction of SWAG, aiming to address these limitations by integrating appearance conditioning and transient object modeling within the 3DGS framework.

Methodology

SWAG introduces two primary innovations to tackle the challenges posed by in-the-wild photo collections:

  • Appearance Variation Modeling: Using an MLP, SWAG models local appearance variations across different images. By encoding each image's appearance as a learnable embedding and coupling it with a positional encoding of the Gaussians' centers, SWAG adapts the color of the 3D Gaussians to reflect the photometric variations inherent in unstructured image collections.
  • Transient Gaussians Modeling: To manage transient objects, SWAG learns image-dependent opacity variations for each Gaussian. These variations allow occluders to be represented in the images where they appear and suppressed everywhere else, achieving a clear disentanglement between static and transient scene elements.
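The two mechanisms above can be sketched in PyTorch. This is an illustrative reconstruction, not the authors' implementation: the module names (`AppearanceMLP`, `TransientOpacity`), the network sizes, and the choice of a per-(image, Gaussian) logit offset for opacity are all assumptions made for clarity. The first module maps a learnable per-image embedding plus a positional encoding of a Gaussian's center to a color adjustment; the second stores an image-dependent opacity offset so a Gaussian can be active in some training images and inactive in others.

```python
import torch
import torch.nn as nn


class AppearanceMLP(nn.Module):
    """Sketch: per-image embedding + positional encoding of Gaussian
    centers -> per-Gaussian RGB adjustment (hypothetical architecture)."""

    def __init__(self, num_images, embed_dim=32, pe_freqs=4, hidden=64):
        super().__init__()
        self.image_embed = nn.Embedding(num_images, embed_dim)
        self.pe_freqs = pe_freqs
        in_dim = embed_dim + 3 * 2 * pe_freqs  # sin + cos per frequency, per axis
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),  # RGB offset
        )

    def positional_encoding(self, x):
        # x: (N, 3) Gaussian centers -> (N, 6 * pe_freqs) Fourier features
        freqs = 2.0 ** torch.arange(self.pe_freqs, device=x.device)
        angles = x[..., None] * freqs                      # (N, 3, F)
        return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(1)

    def forward(self, centers, image_idx):
        emb = self.image_embed(image_idx).expand(centers.shape[0], -1)
        feats = torch.cat([emb, self.positional_encoding(centers)], dim=-1)
        return torch.tanh(self.mlp(feats))  # bounded color shift per Gaussian


class TransientOpacity(nn.Module):
    """Sketch: a learnable logit offset per (image, Gaussian) modulates the
    base opacity, so occluders render only in the images that contain them."""

    def __init__(self, num_images, num_gaussians):
        super().__init__()
        self.delta = nn.Parameter(torch.zeros(num_images, num_gaussians))

    def forward(self, base_opacity, image_idx):
        # base_opacity: (G,) in (0, 1); returns the opacity seen by image_idx
        logits = torch.logit(base_opacity.clamp(1e-4, 1 - 1e-4))
        return torch.sigmoid(logits + self.delta[image_idx])
```

During training, both modules would be optimized jointly with the Gaussians against the photometric loss; at test time, any training image's embedding can be supplied to render the scene in that image's appearance, while the opacity offsets of transient Gaussians are simply dropped.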

Experimental Evaluation

SWAG was evaluated against state-of-the-art methods on the Phototourism dataset and NeRF-OSR. Numerical results showed significant improvements in quality metrics such as PSNR, SSIM, and LPIPS across various scenes, confirming SWAG's superior rendering quality and efficiency. Visual comparisons further demonstrated SWAG's capability to reconstruct scenes with high fidelity to the target appearance and without transient occluders.

Implications and Future Directions

SWAG represents a significant step forward for 3D scene reconstruction from unconstrained photo collections, demonstrating not only the ability to model appearance changes but also to distinguish between transient and static scene components. The method opens avenues for further research, including exploring dynamic scene representations and integrating more advanced machine learning techniques to refine transient object modeling. Additionally, the real-time rendering capability of SWAG coupled with its efficiency and quality presents a compelling case for its application in various practical scenarios, such as virtual tourism and interactive 3D modeling.

Conclusion

In summary, SWAG successfully extends 3D Gaussian Splatting to effectively utilize in-the-wild photo collections for novel view synthesis and 3D scene reconstruction. By innovatively addressing appearance variation and transient occluders, SWAG sets a new benchmark for efficiency and quality in the field, paving the way for future advancements in neural rendering technologies.
