Instance Neural Radiance Field (2304.04395v3)

Published 10 Apr 2023 in cs.CV

Abstract: This paper presents one of the first learning-based NeRF 3D instance segmentation pipelines, dubbed as Instance Neural Radiance Field, or Instance NeRF. Taking a NeRF pretrained from multi-view RGB images as input, Instance NeRF can learn 3D instance segmentation of a given scene, represented as an instance field component of the NeRF model. To this end, we adopt a 3D proposal-based mask prediction network on the sampled volumetric features from NeRF, which generates discrete 3D instance masks. The coarse 3D mask prediction is then projected to image space to match 2D segmentation masks from different views generated by existing panoptic segmentation models, which are used to supervise the training of the instance field. Notably, beyond generating consistent 2D segmentation maps from novel views, Instance NeRF can query instance information at any 3D point, which greatly enhances NeRF object segmentation and manipulation. Our method is also one of the first to achieve such results in pure inference. Experimented on synthetic and real-world NeRF datasets with complex indoor scenes, Instance NeRF surpasses previous NeRF segmentation works and competitive 2D segmentation methods in segmentation performance on unseen views. Watch the demo video at https://youtu.be/wW9Bme73coI. Code and data are available at https://github.com/lyclyc52/Instance_NeRF.

Summary

The paper introduces a novel Instance-NeRF architecture that integrates 3D proposal-based mask predictions with volumetric features from a pre-trained NeRF.
The approach achieves multi-view consistent segmentation by projecting coarse 3D masks into image space and matching them against 2D masks from existing panoptic segmentation models, without requiring explicit 3D geometry as input.
Experiments on synthetic and real-world datasets show that Instance-NeRF outperforms prior NeRF segmentation methods and competitive 2D segmentation methods on unseen views.

Introduction

Instance segmentation, the task of distinguishing individual objects within a scene, is essential for scene understanding and manipulation. While 2D instance segmentation has been highly successful, extending it to 3D is harder, in part because 3D tasks lack the abundant training data available for images. The paper presents Instance Neural Radiance Field (Instance-NeRF), a learning-based method for 3D instance segmentation built on a NeRF representation.

Methodology

At the core of the approach is a 3D proposal-based mask prediction network applied to volumetric features sampled from a pre-trained NeRF. The network produces discrete, coarse 3D instance masks. These masks are projected into image space and matched against 2D segmentation masks that existing panoptic segmentation models generate for different views; the matched 2D masks then supervise the training of the instance field.
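The matching step can be illustrated with a minimal sketch. It assumes each mask is a set of pixel coordinates and assigns every 2D mask to the projected 3D instance with the highest intersection-over-union (IoU). The function names, the greedy assignment, and the `min_iou` threshold are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection-over-union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_masks(
    projected: Dict[int, Set[Pixel]],  # 3D instance id -> pixels of its projection in this view
    detected: Dict[int, Set[Pixel]],   # 2D mask id -> pixels from a panoptic model
    min_iou: float = 0.25,             # hypothetical threshold, chosen for illustration
) -> Dict[int, int]:
    """Map each 2D mask id to the 3D instance id whose projection overlaps it most."""
    assignment: Dict[int, int] = {}
    for det_id, det_pixels in detected.items():
        best_id, best_score = None, min_iou
        for inst_id, proj_pixels in projected.items():
            score = iou(det_pixels, proj_pixels)
            if score > best_score:
                best_id, best_score = inst_id, score
        if best_id is not None:  # masks with no sufficiently overlapping projection stay unmatched
            assignment[det_id] = best_id
    return assignment
```

Running this per view relabels the 2D masks with shared 3D instance ids, which is what makes the labels consistent across views in this sketch. 2D masks left unmatched would simply not contribute supervision.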

This setup has two notable consequences. First, Instance-NeRF produces segmentation that is consistent across views without requiring explicit 3D geometry as input. Second, according to the authors, it is among the first methods to achieve these results in pure inference, that is, without relying on ground-truth 3D segmentation labels at inference time.

Architectural Innovations

Instance-NeRF has two key components: a pre-trained NeRF that models the scene's radiance and density fields, and an instance field that represents 3D instance information. It extends the existing NeRF representation with an instance branch, which lets the model distinguish individual objects in the 3D scene.
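The sketch below shows one way such a two-part model could be organized. It assumes the pre-trained NeRF and the instance branch are both callables that take a 3D point. The class name, the callable signatures, and the toy fields in the usage example are hypothetical and only illustrate the data flow.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
RGB = Tuple[float, float, float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over instance logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


@dataclass
class InstanceFieldSketch:
    # Pre-trained NeRF (assumed frozen): point -> (density, rgb).
    base_field: Callable[[Point], Tuple[float, RGB]]
    # Added instance branch: point -> one logit per instance id.
    instance_head: Callable[[Point], Sequence[float]]

    def query(self, p: Point) -> Tuple[float, RGB, List[float]]:
        """Return density, color, and a distribution over instance ids at p."""
        density, rgb = self.base_field(p)
        return density, rgb, softmax(self.instance_head(p))

    def instance_id(self, p: Point) -> int:
        """Most likely instance id at a 3D point."""
        probs = self.query(p)[2]
        return max(range(len(probs)), key=probs.__getitem__)


# Toy stand-ins: a unit-sphere density and two "objects" split by the sign of x.
def toy_base(p: Point) -> Tuple[float, RGB]:
    inside = sum(c * c for c in p) <= 1.0
    return (1.0 if inside else 0.0), (0.5, 0.5, 0.5)


def toy_head(p: Point) -> Sequence[float]:
    return [-4.0 * p[0], 4.0 * p[0]]


model = InstanceFieldSketch(base_field=toy_base, instance_head=toy_head)
```

With these toy fields, `model.instance_id((-0.5, 0.0, 0.0))` selects instance 0 and `model.instance_id((0.5, 0.0, 0.0))` selects instance 1. In this sketch the radiance and density outputs are untouched, and only the instance branch would need training.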

The method section describes NeRF-RCNN, the 3D proposal-based mask prediction network, and a refinement mechanism that projects the coarse 3D segmentation into 2D and matches it across views for consistency. The resulting neural instance field yields multi-view consistent 2D segmentations as well as a continuous 3D segmentation.
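One common way to turn a continuous instance field into a 2D segmentation is to composite per-sample instance probabilities along each camera ray with the standard NeRF volume rendering weights. The sketch below assumes that formulation; the function names are illustrative, and the paper's exact rendering of the instance field may differ.

```python
import math
from typing import List, Sequence, Tuple


def render_weights(densities: Sequence[float], deltas: Sequence[float]) -> List[float]:
    """Standard NeRF compositing weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights: List[float] = []
    transmittance = 1.0  # fraction of light that reaches the current sample
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def render_instance_label(
    densities: Sequence[float],             # density at each sample along the ray
    deltas: Sequence[float],                # distance between consecutive samples
    sample_probs: Sequence[Sequence[float]],  # per-sample distribution over instance ids
) -> Tuple[int, List[float]]:
    """Composite per-sample instance probabilities into one pixel label."""
    weights = render_weights(densities, deltas)
    num_instances = len(sample_probs[0])
    pixel = [0.0] * num_instances
    for w, probs in zip(weights, sample_probs):
        for k in range(num_instances):
            pixel[k] += w * probs[k]
    label = max(range(num_instances), key=pixel.__getitem__)
    return label, pixel
```

Because every view is rendered from the same 3D field, the pixel labels agree across views by construction, which is the sense in which the 2D segmentations are multi-view consistent. The first opaque surface along a ray dominates the label, since later samples receive little transmittance.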

Experimental Validation

Experiments on synthetic and real-world datasets, including the complex indoor scenes of 3D-FRONT, show that Instance-NeRF achieves better segmentation performance on unseen views than previous NeRF segmentation methods and competitive 2D segmentation methods.

The paper's contribution is threefold: it proposes a novel architecture for 3D instance segmentation in NeRF, details a training approach for the neural instance field, and demonstrates effectiveness through experiments and ablation studies. The method is among the first to yield both multi-view consistent 2D segmentation and continuous 3D segmentation from a NeRF representation, and the code is publicly available.

Conclusion

Instance-NeRF opens a path toward 3D instance segmentation built directly on the continuous representation that NeRF provides. Its ability to query instance information at any 3D position makes NeRF more useful for object segmentation and manipulation, and it connects the success of 2D image segmentation with 3D scene understanding. The approach is a promising basis for future work on 3D instance segmentation in complex real-world scenes.
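As a small illustration of why per-point instance queries help with manipulation, the sketch below removes one object by zeroing the density of every sample assigned to it before rendering. It assumes densities and instance ids have already been queried at the sample points; the function name is hypothetical, and this is not an editing procedure described in the paper.

```python
from typing import List, Sequence


def remove_instance(
    densities: Sequence[float],   # density at each queried 3D sample
    instance_ids: Sequence[int],  # instance id predicted at the same samples
    target: int,                  # instance to delete from the scene
) -> List[float]:
    """Return densities with the target instance made fully transparent."""
    return [
        0.0 if inst == target else sigma
        for sigma, inst in zip(densities, instance_ids)
    ]
```

Rendering with the modified densities would leave the rest of the scene unchanged while the selected object disappears, something that 2D masks alone cannot provide for novel views.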