Emergent Mind

GO-NeRF: Generating Virtual Objects in Neural Radiance Fields

Published Jan 11, 2024 in cs.CV


Despite advances in 3D generation, the direct creation of 3D objects within an existing 3D scene represented as NeRF remains underexplored. This process requires not only high-quality 3D object generation but also seamless composition of the generated 3D content into the existing NeRF. To this end, we propose a new method, GO-NeRF, capable of utilizing scene context for high-quality and harmonious 3D object generation within an existing NeRF. Our method employs a compositional rendering formulation that allows the generated 3D objects to be seamlessly composited into the scene utilizing learned 3D-aware opacity maps without introducing unintended scene modification. Moreover, we also develop tailored optimization objectives and training strategies to enhance the model's ability to exploit scene context and mitigate artifacts, such as floaters, originating from 3D object generation within a scene. Extensive experiments on both feed-forward and $360^\circ$ scenes show the superior performance of our proposed GO-NeRF in generating objects harmoniously composited with surrounding scenes and synthesizing high-quality novel view images. Project page at https://daipengwa.github.io/GO-NeRF/.


  • GO-NeRF introduces a method for adding 3D objects into existing scenes using neural radiance fields for realistic scene composition.

  • It features a user interface for precise virtual object placement and employs context-aware learning to ensure scene compatibility.

  • The method uses score distillation sampling from diffusion models to keep the object consistent with its context and introduces a regularizer that keeps the object's saturation in line with the rest of the scene.

  • Experiments demonstrate GO-NeRF's ability to create harmonious scenes with compatible virtual objects that align well with text prompts.

  • GO-NeRF shows potential for various applications, but it has limitations: for example, effects of a generated object (such as reflections) can fall outside the specified 3D box.


GO-NeRF, short for "Generating Virtual Objects in Neural Radiance Fields," is a recently proposed method for generating 3D objects directly inside an existing 3D scene. Its goal is to support scene creation and editing, where newly generated objects must blend seamlessly into the existing scene.


The approach is built on two key components: a compositional rendering formulation and context-aware learning objectives. GO-NeRF also provides a user interface for placing a virtual object precisely: the user selects a 3D location using the scene's depth information. The method then generates a new object neural radiance field at that location, renders it separately from the scene, and composites the two renders. Because the object and the scene are rendered separately, the original scene content is left unmodified.
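The placement and compositing steps can be illustrated with a minimal NumPy sketch. It is a simplification under stated assumptions, not GO-NeRF's implementation: `unproject_pixel` and `composite` are hypothetical helper names, the camera is assumed to be a pinhole with intrinsics `K`, depth is assumed to be z-depth, and the blend is a plain 2D alpha blend rather than the learned 3D-aware opacity described in the paper.

```python
import numpy as np

def unproject_pixel(u, v, depth, K):
    """Lift pixel (u, v) with z-depth `depth` to a 3D point in camera coordinates.

    Assumes a pinhole camera with 3x3 intrinsics K (hypothetical setup).
    """
    pixel_h = np.array([u, v, 1.0])
    return depth * (np.linalg.inv(K) @ pixel_h)

def composite(scene_rgb, object_rgb, object_opacity):
    """Alpha-blend an object render over a scene render.

    scene_rgb, object_rgb: (H, W, 3) arrays in [0, 1].
    object_opacity: (H, W) array in [0, 1]; 0 leaves the scene pixel unchanged.
    """
    alpha = object_opacity[..., None]
    return alpha * object_rgb + (1.0 - alpha) * scene_rgb

# Illustrative values only: a 64x64 camera with focal length 100.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0, 0.0, 1.0]])
center = unproject_pixel(32.0, 32.0, 2.0, K)  # user-selected pixel at depth 2

scene = np.full((64, 64, 3), 0.2)   # stand-in for the scene render
obj = np.full((64, 64, 3), 0.9)     # stand-in for the object render
opacity = np.zeros((64, 64))
opacity[16:48, 16:48] = 1.0         # object covers the central region
image = composite(scene, obj, opacity)
```

Pixels where the opacity is zero are copied from the scene render unchanged, which is the property that keeps the original scene content intact in this simplified setting.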

To ensure the generated object complements the scene contextually, GO-NeRF leverages 2D image inpainting priors from diffusion models, employing what is known as score distillation sampling. In addition, a regularizer is introduced to harmonize the saturation levels of the generated object with the rest of the scene, which addresses issues related to over-saturation.
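As a rough illustration of these two objectives, the sketch below computes a score distillation sampling (SDS) gradient with respect to a rendered image and a simple saturation penalty. Everything here is an assumption made for illustration: `noise_predictor` is a hypothetical stand-in for a text-conditioned inpainting diffusion model, the weight `1 - alpha_bar_t` is one common choice of SDS weighting, and `saturation_penalty` is a guess at the general idea of a saturation regularizer, not the form used in the paper.

```python
import numpy as np

def sds_gradient(image, noise_predictor, alpha_bar_t, rng):
    """SDS gradient with respect to a rendered image.

    Adds noise at level alpha_bar_t, asks the (hypothetical) diffusion model
    to predict that noise, and returns w(t) * (eps_pred - eps), with the
    assumed weighting w(t) = 1 - alpha_bar_t.
    """
    eps = rng.standard_normal(image.shape)
    noisy = np.sqrt(alpha_bar_t) * image + np.sqrt(1.0 - alpha_bar_t) * eps
    eps_pred = noise_predictor(noisy)
    return (1.0 - alpha_bar_t) * (eps_pred - eps)

def saturation(rgb):
    """HSV-style saturation per pixel: (max - min) / max, and 0 for black."""
    cmax = rgb.max(axis=-1)
    cmin = rgb.min(axis=-1)
    return np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-8), 0.0)

def saturation_penalty(object_rgb, scene_rgb):
    """Penalize the object only when it is more saturated than the scene."""
    excess = saturation(object_rgb).mean() - saturation(scene_rgb).mean()
    return float(max(excess, 0.0) ** 2)

# Illustrative usage with a dummy predictor that always returns zeros.
rng = np.random.default_rng(0)
render = np.full((8, 8, 3), 0.5)
grad = sds_gradient(render, lambda x: np.zeros_like(x), 0.5, rng)

vivid = np.zeros((4, 4, 3))
vivid[..., 0] = 1.0                 # pure red, saturation 1
gray = np.full((4, 4, 3), 0.5)      # neutral gray, saturation 0
penalty = saturation_penalty(vivid, gray)
```

In an actual training loop the SDS gradient would be backpropagated through the renderer into the object's radiance field parameters; the sketch stops at the image to stay self-contained.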

Experimentation and Results

The efficacy of GO-NeRF was validated through extensive experiments across various datasets, where it compared favorably with alternative methods. The results highlight GO-NeRF’s ability to produce high-quality, context-compatible 3D objects with shadows and reflections that contribute to a harmonious scene. A standout feature is GO-NeRF’s interface, which streamlines the generation process and makes it viable for users without specialized 3D software expertise.

Quantitative evaluations also favor GO-NeRF: it achieves higher CLIP scores, a metric that quantifies how well the generated results align with the text prompts. This suggests that the virtual objects it produces match their text descriptions more closely than those of the compared methods.
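For readers unfamiliar with the metric, a CLIP score is commonly computed as the cosine similarity between a CLIP image embedding and a CLIP text embedding. The sketch below shows only that final step, assuming the embeddings have already been produced by a CLIP model; the function name `clip_score` and the toy vectors are illustrative, and scaling conventions (for example, multiplying by 100) vary between papers.

```python
import numpy as np

def clip_score(image_embedding, text_embedding):
    """Cosine similarity between an image embedding and a text embedding."""
    a = np.asarray(image_embedding, dtype=float)
    b = np.asarray(text_embedding, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(a @ b)

# Toy embeddings (real ones would come from a CLIP image and text encoder).
aligned = clip_score([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])     # same direction
unrelated = clip_score([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])   # orthogonal
```

A higher value means the rendered view and the prompt point in a more similar direction in CLIP's embedding space.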

Potential and Future Directions

The implications of GO-NeRF span various applications such as virtual reality, game design, and film production, where accurate and realistic scene construction is essential. GO-NeRF also opens up possibilities for image inpainting and style adaptation, enabling more nuanced and detailed scene editing.

While there are limitations, such as the potential mismatch between the defined 3D box and areas affected by generated objects (e.g., reflections outside the box), the foundation laid by GO-NeRF paves the way for future investigations. Moving forward, dynamic adjustments to specified boxes and addressing SDS loss limitations could further refine this technology.

In summary, GO-NeRF represents a significant advancement in the realm of 3D object generation and scene composition in neural radiance fields, offering exciting opportunities for creating immersive and cohesive 3D environments.
