Emergent Mind

COLMAP-Free 3D Gaussian Splatting

(2312.07504)
Published Dec 12, 2023 in cs.CV

Abstract

While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses. To relax this constraint, multiple efforts have been made to train Neural Radiance Fields (NeRFs) without pre-processed camera poses. However, the implicit representations of NeRFs provide extra challenges to optimize the 3D structure and camera poses at the same time. On the other hand, the recently proposed 3D Gaussian Splatting provides new opportunities given its explicit point cloud representations. This paper leverages both the explicit geometric representation and the continuity of the input video stream to perform novel view synthesis without any SfM preprocessing. We process the input frames in a sequential manner and progressively grow the 3D Gaussians set by taking one input frame at a time, without the need to pre-compute the camera poses. Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes. Our project page is https://oasisyang.github.io/colmap-free-3dgs

Overview

  • The paper leverages the recently proposed 3D Gaussian Splatting to address challenges in novel view synthesis and 3D scene reconstruction without pre-computed camera poses.

  • It presents a new method, CF-3DGS, which eschews the need for SfM pre-processing and instead processes input video frames sequentially.

  • By employing dual-local-global optimization, CF-3DGS achieves improved scene reconstruction and camera pose predictions.

  • CF-3DGS demonstrates robust pose estimation and higher quality view synthesis, even in cases of extensive camera motion like 360-degree video recordings.

  • The method promises significant implications for the creation of realistic virtual experiences in various applications.

Introduction to Novel View Synthesis

The evolving domain of photo-realistic scene reconstruction and novel view synthesis has seen remarkable progress, particularly with the advent of Neural Radiance Fields (NeRF). These developments hinge on upfront computation of camera poses, traditionally derived from Structure-from-Motion (SfM) techniques, such as those provided by the COLMAP library. However, pre-computed camera poses can create bottlenecks and limitations.

Challenges and Innovations

NeRF's implicit scene representation imposes inherent constraints when the 3D scene structure and camera poses must be determined simultaneously. For instance, methods like 'Nope-NeRF' struggle when camera poses change substantially, a common case in 360-degree video recordings. Integrating camera pose estimation within the NeRF framework has long been a chicken-and-egg problem: accurate poses are needed to learn the scene, yet a good scene model is needed to recover the poses.

Enter 3D Gaussian Splatting. Its explicit point cloud representation offers a new perspective, opening a window for bypassing pre-computed camera pose estimation. Recognizing this potential, the authors propose COLMAP-Free 3D Gaussian Splatting (CF-3DGS). This method harnesses the explicit geometric representation together with the temporal continuity of video frames to perform novel view synthesis without any SfM pre-processing.
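To make the contrast with NeRF's implicit representation concrete, the sketch below shows what an explicit Gaussian point cloud might look like. The class and field names are illustrative assumptions, not the paper's implementation; the point is that the scene is a plain set of parameters that can be re-posed or grown directly:

```python
import numpy as np

class GaussianCloud:
    """A minimal, explicit point-based scene representation in the spirit
    of 3D Gaussian Splatting. All names and fields are illustrative."""

    def __init__(self, points: np.ndarray, colors: np.ndarray):
        n = points.shape[0]
        self.means = points.astype(np.float64)            # (N, 3) centers
        self.colors = colors.astype(np.float64)           # (N, 3) RGB
        self.scales = np.full((n, 3), 0.01)               # per-axis extent
        self.rotations = np.tile([1.0, 0.0, 0.0, 0.0], (n, 1))  # unit quaternions
        self.opacities = np.full((n, 1), 0.5)             # blending weights

    def transform(self, R: np.ndarray, t: np.ndarray) -> None:
        """Rigidly move every Gaussian center. Because the cloud is an
        explicit set of points, re-posing it is a single matrix product,
        with no network re-evaluation required."""
        self.means = self.means @ R.T + t
```

This explicitness is what lets a pose estimate be applied, checked, and refined directly against the geometry, rather than being entangled with the weights of an implicit network.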

CF-3DGS Methodology

CF-3DGS operates by processing input video frames sequentially, allowing the set of 3D Gaussians to grow as the camera navigates through space. Each incoming frame yields an updated local 3D Gaussian set, which is then registered against and merged into a global representation of the scene. This dual local-global optimization, working with both current and previous frames, yields significantly better scene reconstruction and camera pose predictions when benchmarked against other SfM-free methods.
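The sequential local-to-global scheme described above can be sketched roughly as follows. This is only an outline under stated assumptions: `estimate_relative_pose` is a hypothetical stand-in for the paper's photometric pose optimization, and the global refinement step is indicated only in comments:

```python
import numpy as np

def estimate_relative_pose(global_cloud, frame):
    """Hypothetical stand-in: in CF-3DGS the relative pose is obtained by
    optimizing a rendering loss between a local Gaussian set (built from
    the previous frame) and the new frame. Here we return the identity."""
    return np.eye(3), np.zeros(3)

def cf_3dgs_sketch(frames, init_cloud):
    """Illustrative outline of the sequential pipeline: each new frame
    yields a relative pose, which is chained onto the previous camera
    and used to grow and refine the global Gaussian set."""
    global_cloud = init_cloud
    poses = [(np.eye(3), np.zeros(3))]  # first camera fixed as world origin
    for frame in frames[1:]:
        # Local step: estimate the relative pose of the new frame.
        R_rel, t_rel = estimate_relative_pose(global_cloud, frame)
        # Chain the relative motion onto the previous absolute pose.
        R_prev, t_prev = poses[-1]
        poses.append((R_rel @ R_prev, R_rel @ t_prev + t_rel))
        # Global step: add newly observed Gaussians to global_cloud and
        # jointly refine the scene and all poses so far (omitted here).
    return poses
```

Because only one frame's pose is optimized at a time against an already-fitted scene, each local problem stays small, which is part of why the method remains robust under large camera motion.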

Advantages and Results

The proposed CF-3DGS method sets itself apart with robust pose estimation and higher-quality novel view synthesis across a variety of scenes. It is not confined to small camera motions: its performance excels notably under wide-ranging camera movements, particularly in 360-degree video captures. Moreover, it achieves results on par with state-of-the-art methods like 'Nope-NeRF' with substantially shorter training times.

Future Implications

In the pursuit of mirroring and manipulating reality through technology, methods like CF-3DGS edge us closer to seamless and realistic virtual experiences. Whether for entertainment, simulation, or education, the impacts of such advances open unexplored doors to how we might interact with and visualize our surroundings through the lens of artificial intelligence.
