Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis

(arXiv:2312.16812)
Published Dec 28, 2023 in cs.CV and cs.GR

Abstract

Novel view synthesis of dynamic scenes has been an intriguing yet challenging problem. Despite recent advancements, simultaneously achieving high-resolution photorealistic results, real-time rendering, and compact storage remains a formidable task. To address these challenges, we propose Spacetime Gaussian Feature Splatting as a novel dynamic scene representation, composed of three pivotal components. First, we formulate expressive Spacetime Gaussians by enhancing 3D Gaussians with temporal opacity and parametric motion/rotation. This enables Spacetime Gaussians to capture static, dynamic, as well as transient content within a scene. Second, we introduce splatted feature rendering, which replaces spherical harmonics with neural features. These features facilitate the modeling of view- and time-dependent appearance while maintaining small size. Third, we leverage the guidance of training error and coarse depth to sample new Gaussians in areas that are challenging to converge with existing pipelines. Experiments on several established real-world datasets demonstrate that our method achieves state-of-the-art rendering quality and speed, while retaining compact storage. At 8K resolution, our lite-version model can render at 60 FPS on an Nvidia RTX 4090 GPU. Our code is available at https://github.com/oppo-us-research/SpacetimeGaussians.

Overview

  • The paper introduces Spacetime Gaussian Feature Splatting, a novel representation for real-time dynamic view synthesis.

  • This method integrates Spacetime Gaussians, Splatted Feature Rendering, and Guided Sampling to balance quality, speed, and storage.

  • The lite version of the representation renders 8K-resolution video at 60 frames per second on an Nvidia RTX 4090 GPU.

  • Experiments on established real-world datasets show that the method surpasses prior approaches in rendering quality and speed while retaining compact storage.

  • The research identifies current limitations, such as the reliance on multi-view video inputs, and suggests potential future improvements.

Introduction to Spacetime Gaussian Feature Splatting

Rendering photorealistic views of dynamic scenes in real time has been a significant challenge in computer vision and graphics. Simultaneously achieving high resolution, real-time rendering, and compact storage is particularly demanding. Technologies that let users explore dynamic scenes from novel viewpoints are of great interest due to their applications in virtual and augmented reality, broadcasting, and education.

Innovations in Dynamic View Synthesis

A recent development addresses the intricate balance between rendering quality, speed, and storage efficiency. A new dynamic scene representation, termed Spacetime Gaussian Feature Splatting, has been proposed, incorporating three innovative components:

  1. Spacetime Gaussians: An extension of 3D Gaussians that adds temporal opacity and parametric motion/rotation to the traditional model. This lets the representation capture static and dynamic elements as well as transient content, such as objects that emerge or vanish over time (see the first sketch after this list).
  2. Splatted Feature Rendering: This technique forgoes spherical harmonics in favor of neural features, which are smaller in size yet remain expressive. The features model view- and time-dependent appearance, contributing to the model's compactness (see the second sketch after this list).
  3. Guided Sampling: The optimization process is improved by sampling new Gaussians in areas that are difficult to render well, particularly regions that are sparsely covered or far from the cameras. The sampling is guided by training error and coarse depth, improving rendering quality in complex scenes (see the third sketch after this list).
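
To make the first component concrete, here is a minimal PyTorch sketch of how one Spacetime Gaussian's time-dependent parameters could be evaluated: temporal opacity as a 1D Gaussian in time, position as a low-degree polynomial of time, and rotation as a renormalized polynomial quaternion. All function and variable names, the polynomial degrees in the example, and the tensor layout are illustrative assumptions, not the authors' released implementation.

```python
import torch

def evaluate_spacetime_gaussian(t, mu_tau, s_tau, sigma_s, poly_pos, poly_rot):
    """Evaluate one Spacetime Gaussian's time-dependent parameters at time t.

    t        : scalar time
    mu_tau   : temporal center of the Gaussian
    s_tau    : temporal scale (larger -> shorter effective lifespan)
    sigma_s  : time-independent spatial opacity
    poly_pos : (Kp + 1, 3) polynomial coefficients for the mean position
    poly_rot : (Kq + 1, 4) polynomial coefficients for the quaternion
    """
    dt = torch.as_tensor(t - mu_tau, dtype=poly_pos.dtype)

    # Temporal opacity: a radial basis function in time lets the Gaussian
    # fade in and out, which is what models transient content.
    opacity_t = sigma_s * torch.exp(-s_tau * dt ** 2)

    # Parametric motion: position as a polynomial in (t - mu_tau).
    powers = dt ** torch.arange(poly_pos.shape[0], dtype=poly_pos.dtype)
    position_t = (poly_pos * powers[:, None]).sum(dim=0)

    # Parametric rotation: polynomial quaternion, renormalized to unit length.
    powers_q = dt ** torch.arange(poly_rot.shape[0], dtype=poly_rot.dtype)
    quat_t = (poly_rot * powers_q[:, None]).sum(dim=0)
    quat_t = quat_t / quat_t.norm()

    return opacity_t, position_t, quat_t

# Example: a Gaussian centered at t = 0.5 with a cubic trajectory
# and a linear rotation (hypothetical degrees, for illustration).
op, pos, quat = evaluate_spacetime_gaussian(
    t=0.7, mu_tau=0.5, s_tau=40.0, sigma_s=0.9,
    poly_pos=torch.randn(4, 3), poly_rot=torch.randn(2, 4))
```

With a near-zero temporal scale and zero higher-order polynomial coefficients, this reduces to an ordinary static 3D Gaussian, which is how a single representation can cover static, dynamic, and transient content.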
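
For the second component, here is a hedged sketch of how splatted neural features might be decoded into color. The idea is that each Gaussian carries a small feature vector instead of spherical-harmonics coefficients; rasterization splats these features into a per-pixel feature map, and a lightweight network turns that map into RGB. The 9-channel layout (3 base-color channels plus 6 channels for view- and time-dependent effects), the hidden width, and the class name are our assumptions.

```python
import torch
import torch.nn as nn

class FeatureDecoder(nn.Module):
    """Turn a splatted feature map into RGB.

    The first three channels act as a base color; the remaining channels,
    together with the per-pixel view direction, feed a small MLP that
    predicts a view/time-dependent residual. The channel split and the
    hidden width are illustrative assumptions.
    """

    def __init__(self, feat_dim=9, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear((feat_dim - 3) + 3, hidden),  # features + view direction
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 3),
        )

    def forward(self, splatted, view_dir):
        # splatted: (H, W, feat_dim) feature map from rasterization
        # view_dir: (H, W, 3) per-pixel unit viewing direction
        base_rgb = splatted[..., :3]
        residual = self.mlp(torch.cat([splatted[..., 3:], view_dir], dim=-1))
        return base_rgb + residual
```

A lite variant can skip the MLP and use only the base-color channels, trading some view-dependent effects for speed; it is the paper's lite model that reaches 60 FPS at 8K resolution.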
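
Finally, a simplified sketch of the guided-sampling idea: pixels whose training error stays high are unprojected with the coarse depth into world space, and new Gaussians are seeded at those 3D points so optimization can recover sparsely covered or distant regions. The error threshold, sampling cap, and camera conventions below are illustrative assumptions.

```python
import torch

def sample_new_gaussian_centers(error_map, depth_map, K_inv, cam_to_world,
                                err_thresh=0.1, max_new=5000):
    """Propose 3D centers for new Gaussians in hard-to-fit regions.

    error_map    : (H, W) per-pixel training error
    depth_map    : (H, W) coarse depth rendered from the current Gaussians
    K_inv        : (3, 3) inverse camera intrinsics
    cam_to_world : (4, 4) camera-to-world transform
    """
    # Select pixels that remain poorly reconstructed.
    ys, xs = torch.nonzero(error_map > err_thresh, as_tuple=True)
    if ys.numel() > max_new:                      # cap the number of proposals
        keep = torch.randperm(ys.numel())[:max_new]
        ys, xs = ys[keep], xs[keep]

    # Unproject the selected pixels with their coarse depth.
    pix = torch.stack(
        [xs.float(), ys.float(), torch.ones_like(xs, dtype=torch.float32)],
        dim=-1)                                               # (N, 3) pixels
    cam_pts = (K_inv @ pix.T).T * depth_map[ys, xs, None]     # camera space
    cam_h = torch.cat([cam_pts, torch.ones(cam_pts.shape[0], 1)], dim=-1)
    world_pts = (cam_to_world @ cam_h.T).T[:, :3]             # world space
    return world_pts
```

Gaussians seeded at these points are then optimized jointly with the rest of the model in subsequent training iterations.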

State-of-the-Art Performance

Experiments with this representation show state-of-the-art results in rendering quality and speed while maintaining a small model size. At 8K resolution, the lite version of the model renders video at 60 frames per second on an Nvidia RTX 4090 GPU.

Contributions and Applications

This research presents several notable contributions:

  • A Spacetime Gaussian model that efficiently renders dynamic views with high fidelity.
  • A rendering technique based on neural features rather than traditional spherical harmonics, enhancing the model's compactness.
  • A guided sampling strategy that refines rendering quality by focusing on challenging areas.
  • Extensive evaluation on established real-world datasets, demonstrating that the method surpasses prior art in rendering quality and speed while maintaining a compact model size.

Conclusion and Future Work

The introduction of Spacetime Gaussian Feature Splatting marks a significant advance in dynamic view synthesis. By addressing the key challenges of rendering quality, speed, and model compactness, this technology is poised to enhance user experiences across multiple applications. However, the representation is not without limitations; it currently requires multi-view video inputs and cannot be trained on-the-fly. Future explorations may include adapting the model for monocular settings and improving its training efficiency to support streaming applications.
