Emergent Mind

OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering

(2404.08449)
Published Apr 12, 2024 in cs.CV

Abstract

Rendering dynamic 3D humans from monocular videos is crucial for various applications such as virtual reality and digital entertainment. Most methods assume the person is in an unobstructed scene, while in real-life scenarios various objects may occlude body parts. A previous method uses NeRF for surface rendering to recover the occluded areas, but it requires more than one day to train and several seconds to render, failing to meet the requirements of real-time interactive applications. To address these issues, we propose OccGaussian, based on 3D Gaussian Splatting, which can be trained within 6 minutes and produces high-quality human renderings at up to 160 FPS with occluded input. OccGaussian initializes 3D Gaussian distributions in the canonical space and performs an occlusion feature query at occluded regions, extracting aggregated pixel-aligned features to compensate for the missing information. We then use a Gaussian Feature MLP to further process the features, along with occlusion-aware loss functions, to better perceive the occluded area. Extensive experiments on both simulated and real-world occlusions demonstrate that our method achieves comparable or even superior performance compared to the state-of-the-art method, while improving training and inference speeds by 250x and 800x, respectively. Our code will be available for research purposes.

Figure: The framework initializes 3D Gaussians, blends them to pose space, queries features in occluded regions, and predicts spherical harmonic coefficients and opacity.

Overview

  • OccGaussian leverages 3D Gaussian Splatting for rendering high-quality dynamic 3D humans from monocular videos, even in occluded scenarios.

  • It trains 250 times faster than the previous state-of-the-art method and renders at up to 160 FPS, significantly improving efficiency without sacrificing quality.

  • Through methodological innovations like 3D Gaussian Forward Skinning and Occlusion Feature Query, it enhances rendering quality in occluded areas.

  • It demonstrates comparable or superior performance on the ZJU-MoCap and OcMotion datasets, with promising applications in virtual reality and digital entertainment.

3D Gaussian Splatting for Occluded Human Rendering: A Study on OccGaussian

Introduction

Rendering dynamic 3D humans from monocular videos is crucial for virtual reality and digital entertainment. However, occlusion poses a significant challenge, as conventional methods struggle to maintain high-quality renderings when parts of the human body are obstructed. The recently introduced OccGaussian method addresses these limitations by leveraging 3D Gaussian Splatting, which allows rapid training and real-time rendering while still producing high-quality human figures in occluded scenarios.

Technical Summary

OccGaussian initializes 3D Gaussian distributions in the canonical space and performs occlusion feature queries in occluded regions. It then uses a Gaussian Feature MLP to process the aggregated pixel-aligned features, which compensate for the information missing in those regions. The paper reports that OccGaussian trains 250 times faster than the previous state-of-the-art method and renders at up to 160 FPS, an 800-fold improvement in inference speed. This efficiency does not compromise quality: the method demonstrates comparable or superior performance against the state-of-the-art method.
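As a rough sanity check on these figures, the reported speedups can be inverted to estimate what the baseline would cost. The sketch below uses only numbers quoted on this page (6 minutes of training and 160 FPS from the abstract, plus the 250x and 800x speedups); the variable names are illustrative, and the results are back-of-envelope estimates rather than measurements from the paper.

```python
# Back-of-envelope check of the reported speedups (numbers from the abstract).
TRAIN_MINUTES = 6        # OccGaussian training time
RENDER_FPS = 160         # OccGaussian peak rendering speed
TRAIN_SPEEDUP = 250      # reported training speedup
RENDER_SPEEDUP = 800     # reported inference speedup

# Implied cost of the baseline if the speedups are taken at face value.
baseline_train_hours = TRAIN_MINUTES * TRAIN_SPEEDUP / 60
baseline_seconds_per_frame = RENDER_SPEEDUP / RENDER_FPS

print(f"implied baseline training: {baseline_train_hours:.0f} hours")
print(f"implied baseline rendering: {baseline_seconds_per_frame:.0f} s per frame")
```

Both estimates (about 25 hours and about 5 seconds per frame) line up with the abstract's description of the NeRF-based baseline as needing more than one day to train and several seconds to render.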

Methodological Innovations

  • 3D Gaussian Forward Skinning: Initializes 3D Gaussians in the canonical space and blends them to pose space, adapting 3D Gaussian Splatting (3DGS) to dynamic humans under occlusion while retaining its efficiency.
  • Occlusion Feature Query: Performs a K-nearest feature query in occluded regions, then extracts aggregated pixel-aligned features so that local information compensates for the absence of ground truth in these areas.
  • Gaussian Feature MLP: Further processes the features of occluded regions, predicting spherical harmonic coefficients and opacity values with an MLP to enhance rendering quality in occluded areas.
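The three components above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes forward skinning is standard linear blend skinning over per-bone rigid transforms, that an occluded point borrows features from its K nearest visible points with mean aggregation, and that the MLP is a two-layer ReLU network with a sigmoid on opacity. All function names, shapes, and the `sh_dim` size are hypothetical.

```python
import numpy as np

def forward_skinning(means_c, skin_weights, bone_transforms):
    """Warp canonical Gaussian centers to pose space (linear blend skinning, assumed).

    means_c:         (N, 3) Gaussian centers in canonical space
    skin_weights:    (N, B) per-Gaussian blend weights, rows sum to 1
    bone_transforms: (B, 4, 4) canonical-to-pose transform per bone
    """
    # Blend the bone transforms per Gaussian: (N, B) x (B, 4, 4) -> (N, 4, 4)
    blended = np.einsum("nb,bij->nij", skin_weights, bone_transforms)
    # Apply each blended transform to its center in homogeneous coordinates
    homo = np.concatenate([means_c, np.ones((len(means_c), 1))], axis=1)
    return np.einsum("nij,nj->ni", blended, homo)[:, :3]

def knn_feature_query(points, features, occluded_idx, k=3):
    """Average the features of the k nearest visible points (aggregation assumed)."""
    visible = np.ones(len(points), dtype=bool)
    visible[occluded_idx] = False
    visible_pts, visible_feat = points[visible], features[visible]
    # Distances between occluded and visible points: (M, V)
    dists = np.linalg.norm(points[occluded_idx][:, None] - visible_pts[None], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]   # (M, k) indices into the visible set
    return visible_feat[nearest].mean(axis=1)    # (M, C) aggregated features

def gaussian_feature_mlp(feat, w1, b1, w2, b2, sh_dim):
    """Two-layer MLP head mapping features to SH coefficients and opacity (assumed)."""
    hidden = np.maximum(feat @ w1 + b1, 0.0)          # ReLU
    out = hidden @ w2 + b2                            # (M, sh_dim + 1)
    opacity = 1.0 / (1.0 + np.exp(-out[:, sh_dim:]))  # sigmoid keeps opacity in (0, 1)
    return out[:, :sh_dim], opacity

# Toy data: two bones (identity, and a +1 shift along x) and four Gaussians.
bones = np.stack([np.eye(4), np.eye(4)])
bones[1, 0, 3] = 1.0
means = np.zeros((4, 3))
means[:, 1] = np.arange(4)
weights = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]])
posed = forward_skinning(means, weights, bones)

# Treat the last Gaussian as occluded and borrow features from 2 visible neighbors.
features = np.array([[0.0], [2.0], [4.0], [6.0]])
agg = knn_feature_query(posed, features, occluded_idx=[3], k=2)   # -> [[3.]]

# Random weights stand in for a trained network.
rng = np.random.default_rng(0)
sh_dim = 48  # assumed size: 16 SH basis functions (degree 3) x 3 color channels
w1, b1 = rng.normal(size=(1, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, sh_dim + 1)), np.zeros(sh_dim + 1)
sh, opacity = gaussian_feature_mlp(agg, w1, b1, w2, b2, sh_dim)
```

In the toy data, the second Gaussian is weighted equally between the two bones, so it moves half the shift (0.5 along x), and the occluded Gaussian receives the mean of its two nearest visible neighbors' features. A real pipeline would learn the skinning weights and MLP parameters and would sample the features from image pixels.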

Experimental Insights

OccGaussian is evaluated on the ZJU-MoCap and OcMotion datasets, with experiments covering both simulated and real-world occlusions. It reaches rendering quality comparable or superior to the state-of-the-art method while training and rendering far faster, which makes it particularly suitable for real-time applications.

Practical Implications and Future Prospects

OccGaussian advances 3D human rendering for scenes complicated by occlusions. Its combination of efficiency and quality makes it an appealing option for a wide range of applications, from virtual try-on and augmented reality to virtual production in films.

Future research may explore incorporating temporal information to improve the reconstruction of severely occluded regions, a current limitation of OccGaussian. Additionally, improving the method's robustness to inaccuracies in pose and camera parameters could extend its applicability to in-the-wild videos.

Conclusion

OccGaussian introduces a novel approach to rendering occluded humans from monocular videos by leveraging 3D Gaussian Splatting. Its fast training and rendering, combined with high-quality results in the presence of occlusions, make it a practical basis for real-time applications and for further work on 3D human rendering.
