- The paper introduces a novel method for reconstructing high-resolution, photorealistic 3D facial models from single 'in-the-wild' images using rendering-aware GANs and a new dataset.
- It combines the new RealFaceDB dataset with a rendering-aware GAN framework that uses photorealistic differentiable rendering to infer and disentangle diffuse and specular reflectance.
- The approach enables the creation of realistic 3D facial avatars for computer graphics, virtual reality, and augmented reality applications.
Facial Shape and BRDF Inference with Photorealistic Rendering-Aware GANs: A Technical Overview
The paper "Facial Shape and BRDF Inference with Photorealistic Rendering-Aware GANs" addresses the complex task of reconstructing photorealistic 3D facial models from single "in-the-wild" images. The authors propose an innovative method that significantly advances the accuracy and realism of 3D facial reconstructions, which can be directly utilized for rendering in virtual environments.
At the core of this approach is the first-of-its-kind RealFaceDB dataset, which contains high-quality facial reflectance data from over 200 subjects. The dataset captures several reflectance components, including diffuse albedo, specular albedo, and surface normals, offering a robust foundation for training deep learning models to infer fine-grained facial attributes from images. This data collection effort mitigates the scarcity of high-quality training samples that has historically hindered progress in this domain.
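To make the composition concrete, the sketch below shows what a single per-subject training record might contain, based only on the reflectance components named above. The class name, field names, and resolutions are hypothetical illustrations, not the published RealFaceDB schema.

```python
# Hypothetical per-subject record; field names and shapes are assumptions,
# not the actual RealFaceDB format.
from dataclasses import dataclass
import numpy as np

@dataclass
class FaceReflectanceSample:
    diffuse_albedo: np.ndarray    # (H, W, 3) UV-space diffuse colour
    specular_albedo: np.ndarray   # (H, W, 1) UV-space specular intensity
    normals: np.ndarray           # (H, W, 3) tangent-space surface normals
    geometry: np.ndarray          # (V, 3) reconstructed mesh vertices

# Example: assemble a placeholder sample at a 512x512 texture resolution.
sample = FaceReflectanceSample(
    diffuse_albedo=np.zeros((512, 512, 3), dtype=np.float32),
    specular_albedo=np.zeros((512, 512, 1), dtype=np.float32),
    normals=np.zeros((512, 512, 3), dtype=np.float32),
    geometry=np.zeros((10000, 3), dtype=np.float32),
)
```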
To perform the reconstruction, the authors employ Generative Adversarial Networks (GANs) within a rendering-aware framework. The procedure has multiple steps: an initial 3D Morphable Model (3DMM) fitting algorithm produces a base geometry and texture estimate, and a deep image-translation network then refines that estimate, using photorealistic differentiable rendering losses combined with adversarial and feature-matching losses to separate baked-in illumination from the individual reflectance components. This design disentangles diffuse and specular characteristics and yields high-resolution, render-ready 3D faces.
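A minimal sketch of how such a refinement step might combine these losses is shown below, in PyTorch. The network architectures, the toy shading function standing in for the photorealistic differentiable renderer, and the loss weights are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch only: toy networks, a stand-in shading model, and
# made-up loss weights, to show how rendering, adversarial, and
# feature-matching losses could be combined in one refinement step.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReflectanceTranslator(nn.Module):
    """Toy image-translation net: maps a texture with baked-in lighting to
    diffuse albedo, specular albedo, and normals (stacked along channels)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 7, 3, padding=1),  # 3 diffuse + 1 specular + 3 normals
        )

    def forward(self, baked_texture):
        out = self.net(baked_texture)
        diffuse = torch.sigmoid(out[:, :3])
        specular = torch.sigmoid(out[:, 3:4])
        normals = F.normalize(out[:, 4:7], dim=1)
        return diffuse, specular, normals

def toy_differentiable_render(diffuse, specular, normals, light_dir):
    """Stand-in for the photorealistic differentiable renderer: simple
    Lambertian + Blinn-Phong shading so gradients flow to the reflectance."""
    n_dot_l = (normals * light_dir.view(1, 3, 1, 1)).sum(1, keepdim=True).clamp(min=0)
    view_dir = torch.tensor([0.0, 0.0, 1.0])
    half = F.normalize(light_dir + view_dir, dim=0)
    n_dot_h = (normals * half.view(1, 3, 1, 1)).sum(1, keepdim=True).clamp(min=0)
    return diffuse * n_dot_l + specular * n_dot_h.pow(32)

# One refinement step mixing rendering, adversarial, and feature-matching terms.
translator = ReflectanceTranslator()
discriminator = nn.Sequential(nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
                              nn.Conv2d(16, 1, 4, stride=2))
opt = torch.optim.Adam(translator.parameters(), lr=1e-4)

baked = torch.rand(1, 3, 64, 64)          # placeholder 3DMM texture with baked lighting
target_image = torch.rand(1, 3, 64, 64)   # placeholder in-the-wild crop
light = F.normalize(torch.tensor([0.3, 0.5, 1.0]), dim=0)

diffuse, specular, normals = translator(baked)
rendered = toy_differentiable_render(diffuse, specular, normals, light)

render_loss = F.l1_loss(rendered, target_image)      # differentiable rendering loss
adv_loss = -discriminator(rendered).mean()           # adversarial term (WGAN-style generator loss)
feat_real = discriminator[0](target_image)           # first-layer features for matching
feat_fake = discriminator[0](rendered)
fm_loss = F.l1_loss(feat_fake, feat_real)            # feature-matching term

loss = render_loss + 0.1 * adv_loss + 10.0 * fm_loss  # weights are illustrative
opt.zero_grad(); loss.backward(); opt.step()
```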
One of the pivotal technical contributions is a photorealistic differentiable rendering engine embedded in the GAN framework, which outperforms earlier techniques by efficiently simulating subsurface scattering and self-occlusion effects in human skin. This is achieved without resorting to computationally prohibitive global illumination models, keeping processing times feasible during both training and inference. In addition, the authors introduce an autoencoder that predicts self-occlusions, improving the fidelity of rendered 3D models under varying environmental lighting conditions.
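The snippet below sketches the idea of an occlusion-predicting autoencoder whose output attenuates incoming light during shading, avoiding a global illumination solve. The architecture, its input (a UV-space normal map), and the shading use shown here are assumptions, not the authors' design.

```python
# Illustrative occlusion-predicting autoencoder; architecture and inputs are
# assumptions in the spirit of the idea described above.
import torch
import torch.nn as nn

class OcclusionAutoencoder(nn.Module):
    """Maps a per-texel geometry encoding (here: normals) to a single-channel
    occlusion map that attenuates incoming light during shading."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, normal_map):
        return self.decoder(self.encoder(normal_map))

# Usage: the predicted occlusion multiplies the irradiance so shadowed texels
# receive less light, without evaluating a global illumination model.
model = OcclusionAutoencoder()
normal_map = torch.rand(1, 3, 128, 128)   # placeholder UV-space normals
occlusion = model(normal_map)             # values in (0, 1)
irradiance = torch.rand(1, 3, 128, 128)   # placeholder diffuse irradiance
shaded = occlusion * irradiance
```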
The implications of this research are manifold: practically, it enhances the development of realistic avatars for applications in computer graphics, virtual reality, and augmented reality. Theoretically, it offers a scalable approach to overcoming previous limitations in high-frequency detail representation and environmental adaptability in facial reconstruction tasks.
Future research could extend this framework to a broader range of facial expressions and motion, potentially integrating dynamic elements into the rendering-aware GAN pipeline. Additionally, expanding the dataset to cover a wider demographic range could further improve model robustness and generalization across diverse facial traits.
In summary, this paper delivers substantial advances in 3D facial reconstruction using GANs, presenting a holistic approach that combines novel data acquisition, rendering-aware image translation, and photorealistic differentiable rendering. It sets a benchmark for future work on render-ready facial model synthesis from single images, pushing the boundary of high-detail realism in computer-generated imagery.