Emergent Mind

Abstract

We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars. Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and imprecise pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing visually realistic, geometrically accurate and content-wise diverse 3D humans without relying on pre-existing 3D or 2D assets. To address this challenge, we introduce a meticulously crafted workflow that implements accurate physical modeling to learn the enhanced 3D generative model from synthetic 2D data. During inference, we integrate optimization modules to bridge the gap between realistic appearances and coarse 3D shapes. Specifically, En3D comprises three modules: a 3D generator that accurately models generalizable 3D humans with realistic appearance from synthesized balanced, diverse, and structured human images; a geometry sculptor that enhances shape quality using multi-view normal constraints for intricate human anatomy; and a texturing module that disentangles explicit texture maps with fidelity and editability, leveraging semantical UV partitioning and a differentiable rasterizer. Experimental results show that our approach significantly outperforms prior works in terms of image quality, geometry accuracy and content diversity. We also showcase the applicability of our generated avatars for animation and editing, as well as the scalability of our approach for content-style free adaptation.

Overview

  • En3D is a generative model designed to create high-quality 3D human avatars using synthetic 2D data as a starting point.

  • The model uses a workflow that includes a generative module, a geometry sculptor, and a texturing module to produce realistic-looking 3D figures.

  • With optimization modules, En3D enhances the resolution and detail of the 3D shapes and textures it generates.

  • The paper presents three main contributions: a zero-shot generative scheme, a tailored workflow for improved 3D modeling, and detail refinement optimization modules.

  • En3D has been experimentally validated to outperform existing methods in image quality, anatomy fidelity, and adaptability to various styles.

Introduction

3D human avatars play a central role in augmented and virtual reality, video gaming, and telepresence. Building them usually depends on generative models trained on extensive 3D datasets, which are both scarce and costly. En3D instead generates high-quality 3D human avatars without relying on pre-existing 3D or 2D assets, using a new workflow and optimization techniques aimed at visual realism, geometric accuracy, and content diversity.

Generative Model Workflow

En3D departs from earlier strategies by learning its 3D generative model from synthetic 2D data. The training images are structured and view-balanced, and are generated from synthetic pose images with known physical parameters. The pipeline has three parts: a generative module that yields generalizable 3D human figures with realistic appearance, a geometry sculptor that enhances shape detail, and a texturing module that produces texture maps with fidelity and editability.
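To make "view-balanced" concrete, the following is a minimal sketch of one way to place cameras evenly around a subject so that no viewing angle is over-represented. It is an illustration only, not the paper's data pipeline: the function name `balanced_camera_positions`, the orbit radius, the fixed elevation, and the axis convention (y up, subject at the origin) are all assumptions.

```python
import math

def balanced_camera_positions(num_views, radius=2.5, elevation_deg=0.0):
    """Return camera positions evenly spaced in azimuth on an orbit around the origin.

    Hypothetical helper: radius, elevation, and the y-up axis convention
    are illustrative assumptions, not values taken from the paper.
    """
    if num_views <= 0:
        raise ValueError("num_views must be positive")
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        # Equal azimuth steps give every direction the same number of views.
        azim = 2.0 * math.pi * i / num_views
        x = radius * math.cos(elev) * math.sin(azim)
        y = radius * math.sin(elev)
        z = radius * math.cos(elev) * math.cos(azim)
        positions.append((x, y, z))
    return positions

# Eight views: front, back, both sides, and the four diagonals.
cams = balanced_camera_positions(8)
```

Because each camera position is computed rather than estimated, the viewpoint of every synthesized image is known exactly, which is the property the summary attributes to training on synthetic data.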

Enhancements and Contributions

Beyond the generative model itself, En3D adds optimization modules that refine the coarse 3D shapes it produces, so that the final avatar captures human detail and texture more accurately. The paper lists three contributions: a zero-shot generative scheme, a specialized workflow for enhanced 3D modeling, and optimization modules that refine both 3D shape and texture.
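The abstract describes the geometry sculptor as using multi-view normal constraints. One common way to express such a constraint is a loss that penalizes the angle between predicted and target surface normals, averaged over all views. The sketch below shows that idea in plain Python. It is an assumption about the general form of such a loss, not the paper's actual objective, and the names `normalize` and `normal_consistency_loss` are hypothetical.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)

def normal_consistency_loss(pred_views, target_views):
    """Mean of (1 - cosine similarity) over all normals in all views.

    pred_views and target_views are lists of views; each view is a list
    of 3D normal vectors in matching order. Illustrative only.
    """
    total, count = 0.0, 0
    for pred, target in zip(pred_views, target_views):
        for p, t in zip(pred, target):
            p, t = normalize(p), normalize(t)
            # 0 when the normals agree, 2 when they point in opposite directions.
            total += 1.0 - sum(a * b for a, b in zip(p, t))
            count += 1
    if count == 0:
        raise ValueError("no normals to compare")
    return total / count
```

In a real system the normals would come from a differentiable renderer and the loss would be minimized by gradient descent over the mesh; this version only shows how per-view agreement can be scored.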

Experimental Validation

Experimental results show En3D outperforming existing methods in image quality, content diversity, and fidelity to human anatomy. The generated avatars can also be animated and edited, and the framework adapts to a range of styles and content, such as portraits and animated characters. Beyond meeting current needs for 3D avatars in digital applications, this opens opportunities for future 3D synthesis tasks.
