Emergent Mind

Abstract

Realistic scene reconstruction and view synthesis are essential for advancing autonomous driving systems by simulating safety-critical scenarios. 3D Gaussian Splatting excels in real-time rendering and static scene reconstruction but struggles with modeling driving scenarios due to complex backgrounds, dynamic objects, and sparse views. We propose AutoSplat, a framework employing Gaussian splatting to achieve highly realistic reconstructions of autonomous driving scenes. By imposing geometric constraints on Gaussians representing the road and sky regions, our method enables multi-view consistent simulation of challenging scenarios including lane changes. Leveraging 3D templates, we introduce a reflected Gaussian consistency constraint to supervise both the visible and unseen sides of foreground objects. Moreover, to model the dynamic appearance of foreground objects, we estimate residual spherical harmonics for each foreground Gaussian. Extensive experiments on Pandaset and KITTI demonstrate that AutoSplat outperforms state-of-the-art methods in scene reconstruction and novel view synthesis across diverse driving scenarios. Visit our project page at https://autosplat.github.io/.

Figure: AutoSplat framework — reconstructs background and foreground separately, then fuses them to simulate various scenarios.

Overview

  • AutoSplat is a novel framework designed to reconstruct and render autonomous driving scenes using a method called 3D Gaussian Splatting (3DGS), handling dynamic environments and sparse views captured by vehicle sensors.

  • Innovations include geometric constraints for background regions, 3D template-based initialization for dynamic objects, and a reflected Gaussian consistency constraint for symmetrical unseen parts, enhancing scene reconstruction accuracy.

  • The framework demonstrates superior performance in scene reconstruction and novel view synthesis on datasets like Pandaset and KITTI, with practical implications for improving the safety and efficacy of autonomous driving simulations.

AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction

Introduction

The paper presents AutoSplat, a novel framework designed to reconstruct and render autonomous driving scenes using a technique called 3D Gaussian Splatting (3DGS). The method addresses key challenges in simulating complex, dynamic environments encountered in autonomous driving, such as moving objects and sparse views captured by sensors on the ego-vehicle. Previous 3DGS methods have been adept at handling static scenes but falter when applied to the dynamic settings intrinsic to autonomous driving. This work defines geometric constraints for background regions and incorporates novel strategies for handling dynamic foreground objects through advanced initialization and modeling techniques.

Contributions

AutoSplat introduces several key innovations to improve the accuracy and consistency of scene reconstructions and view synthesis:

  1. Background Decomposition and Geometric Constraints: The paper constrains the Gaussian representations of the road and sky regions to enforce multi-view consistency. This significantly mitigates distortions during novel view synthesis, particularly for scenarios such as lane changes.
  2. 3D Template-Based Foreground Initialization: Utilizing 3D templates for initializing Gaussian distributions that represent foreground objects, AutoSplat enhances the realism of dynamic object reconstructions.
  3. Reflected Gaussian Consistency Constraint: This component supervises the symmetrically unseen parts of foreground objects by reflecting Gaussian points across their symmetry planes, ensuring coherent synthesis from different viewpoints.
  4. Dynamic Appearance Modeling: Each foreground Gaussian adapts to temporal changes by estimating residual spherical harmonics, enabling the framework to capture dynamic visual characteristics such as flashing lights and moving shadows.
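The dynamic appearance modeling in item 4 can be illustrated with a minimal sketch. This is not the authors' code: the residual parameterization and function names here are assumptions based on the summary's description, and only the view-independent degree-0 (DC) spherical harmonic term is evaluated, using the constant and 0.5 offset conventional in 3DGS implementations.

```python
import numpy as np

SH_C0 = 0.28209479177387814  # degree-0 spherical harmonic basis constant

def shaded_color(base_sh_dc: np.ndarray, residual_sh_dc: np.ndarray) -> np.ndarray:
    """Add a per-timestep residual to a Gaussian's static DC coefficients
    and evaluate the view-independent color term (common 3DGS convention)."""
    return 0.5 + SH_C0 * (base_sh_dc + residual_sh_dc)

# Hypothetical example: a static reddish surface plus a residual that
# brightens the red channel, e.g. a brake light turning on at this timestep.
base = np.array([0.2, 0.1, -0.05])    # static DC coefficients (RGB)
residual = np.array([0.3, 0.0, 0.0])  # time-dependent residual
print(shaded_color(base, residual))
```

Because only the residual varies over time, the static geometry and base appearance are shared across timesteps while transient effects are absorbed by the residual coefficients.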

Approach

AutoSplat's method comprises several phases: background reconstruction, foreground object initialization and reconstruction, and finally, scene-level fusion.

Background Reconstruction

Background reconstruction is critical due to sparse sensor placements and the complexity of autonomous driving scenes. The method involves decomposing the scene into road, sky, and other regions, constraining Gaussians geometrically to remain flat in the road and sky sections. This is essential for achieving multi-view consistent performance, especially during changes in the ego-vehicle’s trajectory.
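The flatness constraint described above can be sketched as follows. This is a simplified illustration rather than the paper's implementation: it assumes the constraint amounts to clamping each road or sky Gaussian's thinnest scale axis toward zero so the ellipsoid degenerates into a disc lying in the surface plane (orientation alignment with the surface normal is omitted).

```python
import numpy as np

def flatten_scales(scales: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    """Clamp each Gaussian's smallest scale axis to a near-zero thickness,
    flattening the ellipsoid into a disc. `scales` has shape (N, 3)."""
    flat = scales.copy()
    idx = np.argmin(flat, axis=1)          # thinnest axis per Gaussian
    flat[np.arange(len(flat)), idx] = eps  # squash it to near zero
    return flat

# Hypothetical example: three road Gaussians with anisotropic scales
scales = np.array([[0.5, 0.4, 0.2],
                   [0.3, 0.1, 0.6],
                   [0.2, 0.2, 0.05]])
print(flatten_scales(scales))
```

Keeping these Gaussians flat prevents them from protruding out of the road or sky surface, which is what causes the multi-view inconsistencies seen when the ego-vehicle's trajectory shifts laterally.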

Foreground Object Reconstruction

Foreground objects pose significant challenges due to their dynamic nature and sparse visibility. AutoSplat addresses these through:

  • 3D Template Initialization: By leveraging detailed 3D templates and fitting them to the observed data, the method ensures a robust initial placement of Gaussians, which is crucial for accurate dynamic object reconstruction.
  • Reflection-Based Consistency: Enforcing symmetry constraints on Gaussian placements ensures that unseen parts of objects are accurately represented.
  • Modeling Dynamic Appearances: By estimating temporally dependent residuals for each Gaussian's appearance attributes, AutoSplat effectively adapts to dynamic changes in the scene.
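The reflection-based consistency in the second bullet rests on a standard geometric operation: mirroring Gaussian centers across an object's symmetry plane. The sketch below shows that operation (a Householder reflection) in isolation; the function names and the choice of symmetry plane are illustrative assumptions, not the paper's code.

```python
import numpy as np

def reflect_points(points: np.ndarray, normal: np.ndarray,
                   point_on_plane: np.ndarray) -> np.ndarray:
    """Mirror 3D points across the plane defined by a point on the plane
    and a normal vector (Householder reflection)."""
    n = normal / np.linalg.norm(normal)
    d = (points - point_on_plane) @ n     # signed distance to the plane
    return points - 2.0 * d[:, None] * n  # subtract twice the normal component

# Hypothetical example: mirror Gaussian centers across a vehicle's
# lateral symmetry plane x = 0 (in the object's local frame).
pts = np.array([[1.0, 2.0, 0.5],
                [-0.3, 1.0, 0.2]])
mirrored = reflect_points(pts, np.array([1.0, 0.0, 0.0]), np.zeros(3))
print(mirrored)  # x-coordinates flip sign; y and z are unchanged
```

Supervising the mirrored Gaussians with observations of the visible side gives the unseen side a plausible appearance even though it is never directly imaged.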

Experimental Evaluation

The method demonstrates superior performance across various standard metrics on datasets such as Pandaset and KITTI. Specifically:

  • Scene Reconstruction: AutoSplat achieves top-tier PSNR, SSIM, and LPIPS scores, indicating high fidelity in reconstructing static and dynamic elements of the scene.
  • Novel View Synthesis: The framework excels in generating high-quality novel views, particularly during complex maneuvers like lateral lane changes of the ego-vehicle. Evaluations using FID scores show notable improvements over state-of-the-art methods.
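For reference, the PSNR metric cited above is derived directly from mean squared error between the rendered and ground-truth images; SSIM, LPIPS, and FID require learned or windowed comparisons and are typically taken from library implementations. A minimal PSNR sketch:

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images in [0, max_val]."""
    mse = np.mean((img_a - img_b) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Example: a uniform per-pixel error of 0.1 gives MSE = 0.01 -> 20 dB
a = np.zeros((4, 4))
b = np.full((4, 4), 0.1)
print(psnr(a, b))  # 20.0
```

Higher PSNR and SSIM and lower LPIPS indicate closer agreement with held-out ground-truth frames, while FID measures distributional realism of synthesized views that have no ground truth, such as lane-change renders.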

Implications and Future Directions

The practical implications of AutoSplat are significant for the field of autonomous driving. By providing enhanced scene reconstructions and realistic view synthesis, this framework can substantially improve the safety and efficacy of self-driving vehicles, especially in simulation environments critical for testing and validation. Additionally, the theoretical contributions of AutoSplat in leveraging constrained Gaussian splatting and reflected consistency constraints bear potential applications in other computer vision domains requiring dynamic scene representations.

While the current study focuses on rigid objects, extending this method to handle non-rigid objects like pedestrians and cyclists represents an exciting avenue for future research. Moreover, further studies could leverage temporal motion information to reduce dependency on external ground-truth data, which could simplify the process and make it more adaptable to real-world deployment.

Conclusion

AutoSplat stands as a comprehensive framework tailored for the rigorous demands of autonomous driving scene reconstruction and novel view synthesis. Through innovative approaches such as geometric constraints, template-based initialization, and dynamic appearance modeling, the framework significantly advances the capability to simulate complex driving scenarios. This paper provides a robust foundation for future exploration and improvement in autonomous driving technologies, making a meaningful contribution to the field of computer vision and artificial intelligence.
