Abstract

In this paper, we target the adaptive source driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt. However, obtaining editing results that conform to the prompt is nontrivial, as there are two significant challenges: accurately editing only the foreground regions, and maintaining multi-view consistency when given a single-view reference image. To tackle the first challenge, we propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between foreground-region editing and full-image editing, enabling foreground-only manipulation while preserving the background. For the second challenge, we design a class-guided regularization that exploits class priors within the generation model to alleviate the inconsistency problem among different views in image-driven editing. Extensive experiments show that our CustomNeRF produces precise editing results across various real scenes in both text- and image-driven settings.

Overview

  • Neural Radiance Fields (NeRF) enable photo-realistic 3D scene reconstruction by modeling a scene's geometry and view-dependent appearance with neural networks.

  • The paper presents a method for 3D scene editing that allows users to modify foreground objects while preserving the background.

  • Local-Global Iterative Editing (LGIE) is proposed to concentrate edits on the foreground and ensure scene-wide consistency.

  • Class-guided regularization using Text-to-Image (T2I) models maintains geometric consistency across different viewing angles.

  • CustomNeRF, the proposed model, demonstrates precise, realistic editing in real scenes for both text- and image-driven inputs.

Overview of Neural Radiance Fields and Scene Editing

Neural Radiance Fields (NeRF) have become a standard tool for creating realistic 3D scenes that can be rendered from any viewpoint. A NeRF represents a scene with a neural network that maps a 3D position and viewing direction to color and volume density; integrating these values along camera rays via volume rendering produces photorealistic images. Advances in NeRF have spurred research into 3D scene editing, where objects within a scene can be re-textured, restyled, or replaced to suit different needs. However, editing NeRF scenes directly is challenging: edits must be confined to specific areas, known as foreground regions, while remaining consistent across different viewpoints.
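To make the underlying representation concrete, here is a minimal sketch of a NeRF-style network in PyTorch. It is illustrative only: the class names, layer sizes, and encoding frequencies are assumptions for exposition, and ray sampling plus volume rendering are omitted.

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Map coordinates to sin/cos features so the MLP can fit high-frequency detail."""
    def __init__(self, num_freqs):
        super().__init__()
        self.register_buffer("freqs", (2.0 ** torch.arange(num_freqs)) * math.pi)

    def forward(self, x):                   # x: (N, 3)
        angles = x[..., None] * self.freqs  # (N, 3, num_freqs)
        return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(-2)

class TinyNeRF(nn.Module):
    """Toy radiance field: (3D position, view direction) -> (density, RGB)."""
    def __init__(self, hidden=128, pos_freqs=10, dir_freqs=4):
        super().__init__()
        self.pos_enc = PositionalEncoding(pos_freqs)
        self.dir_enc = PositionalEncoding(dir_freqs)
        self.trunk = nn.Sequential(
            nn.Linear(6 * pos_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)
        self.color_head = nn.Sequential(
            nn.Linear(hidden + 6 * dir_freqs, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, xyz, view_dir):
        h = self.trunk(self.pos_enc(xyz))
        density = torch.relu(self.density_head(h))  # non-negative volume density
        rgb = self.color_head(torch.cat([h, self.dir_enc(view_dir)], dim=-1))
        return density, rgb

# Query the field at 1024 sample points (ray sampling and rendering omitted).
model = TinyNeRF()
density, rgb = model(torch.rand(1024, 3), torch.rand(1024, 3))
print(density.shape, rgb.shape)  # torch.Size([1024, 1]) torch.Size([1024, 3])
```

Because the density head depends only on position while the color head also sees the viewing direction, the same point can look different from different angles, which is what makes view-dependent effects like specular highlights possible.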

Adaptive Source Driven 3D Scene Editing

The paper introduces a solution for customized 3D scene editing by incorporating adaptive source input, either in the form of text descriptions or reference images. This allows for the modification of a scene's foreground while keeping its background unchanged, tackling a common difficulty in prior work where changes could inadvertently affect untargeted parts of the scene.

Local-Global Iterative Editing

To concentrate edits on the foreground, the authors propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between local stages, which focus on the foreground region, and global stages, which consider the entire scene. The scheme relies on a foreground-aware NeRF that estimates which parts of the scene should be edited. By switching the training objective between the foreground and the whole scene as needed, the method preserves the original layout and background details.
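A rough sketch of how such an alternating loop might look is given below. The helper names (`render_random_view`, `guidance_loss`) and the even/odd alternation schedule are hypothetical stand-ins for exposition, not the paper's actual implementation.

```python
# Hypothetical sketch of one LGIE optimization step. Helper names
# (render_random_view, guidance_loss) are illustrative, not the paper's API.

def lgie_step(nerf, optimizer, guidance_loss, step, fg_prompt, full_prompt):
    """Alternate between foreground-only (local) and full-image (global) editing."""
    local_stage = step % 2 == 0  # assumed 1:1 alternation schedule

    # A foreground-aware NeRF renders an image plus a foreground mask
    # indicating which pixels belong to the region being edited.
    rgb, fg_mask = nerf.render_random_view()

    if local_stage:
        # Local stage: supervise only the masked foreground with the editing
        # prompt, so gradients do not disturb the background.
        loss = guidance_loss(rgb * fg_mask, fg_prompt)
    else:
        # Global stage: supervise the full rendering so the edited foreground
        # blends coherently with the preserved background and scene layout.
        loss = guidance_loss(rgb, full_prompt)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```

In practice, `guidance_loss` would be a score-distillation-style objective computed by a pretrained text-to-image diffusion model, and the foreground mask would come from the foreground-aware NeRF itself.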

Class-guided Regularization for Image-driven Editing

Another challenge arises when editing is guided by a single-view reference image, which can produce inconsistencies when the scene is rendered from other perspectives. The authors address this with a class-guided regularization technique: a Text-to-Image (T2I) model encodes the visual subject of the reference image into a textual prompt, and during editing the T2I model's general class priors then guide geometric consistency across views.
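One way to picture this regularization is as prompt construction: the reference subject is bound to a learned pseudo-token, and on some training steps only the general class word is used, so the T2I model's broad class prior, learned from many viewpoints, shapes the geometry. The token names and the 50/50 schedule in the sketch below are assumptions for illustration, not the paper's exact scheme.

```python
# Illustrative sketch of class-guided prompt regularization. The pseudo-token,
# class word, and mixing schedule are assumptions, not the paper's exact scheme.
import random

SUBJECT_TOKEN = "<ref>"  # pseudo-word assumed to be learned from the reference image
CLASS_WORD = "dog"       # the general class the subject belongs to

def build_prompt(template="a photo of a {} in a garden"):
    """Sometimes drop the subject token and keep only the class word, so the
    T2I model's broad class prior (seen from many viewpoints) regularizes
    geometry instead of overfitting to the single-view reference appearance."""
    use_class_prior = random.random() < 0.5  # assumed mixing schedule
    subject = CLASS_WORD if use_class_prior else f"{SUBJECT_TOKEN} {CLASS_WORD}"
    return template.format(subject)

for _ in range(4):
    print(build_prompt())
```

The intuition is that the reference image pins down the subject's appearance from one view only, while the class word taps the diffusion model's knowledge of what such objects look like from all sides.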

Results and Conclusions

The model, named CustomNeRF, is shown to produce precise editing results in various real scenes for both text- and image-driven settings. Extensive experiments reveal that CustomNeRF can effectively modify the specified regions in a photo-realistic manner, demonstrating the value of LGIE and class-guided regularization for 3D scene editing. These contributions mark a significant step toward letting users customize scenes to their specific needs and preferences, broadening the accessibility and flexibility of NeRF-based editing tools.
