Emergent Mind

Abstract

We present UniPlane, a novel method that unifies plane detection and reconstruction from posed monocular videos. Unlike existing methods that detect planes from local observations and associate them across the video for the final reconstruction, UniPlane unifies both the detection and the reconstruction tasks in a single network, which allows us to directly optimize final reconstruction quality and fully leverage temporal information. Specifically, we build a Transformers-based deep neural network that jointly constructs a 3D feature volume for the environment and estimates a set of per-plane embeddings as queries. UniPlane directly reconstructs the 3D planes by taking dot products between voxel embeddings and the plane embeddings followed by binary thresholding. Extensive experiments on real-world datasets demonstrate that UniPlane outperforms state-of-the-art methods in both plane detection and reconstruction tasks, achieving +4.6 in F-score in geometry as well as consistent improvements in other geometry and segmentation metrics.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.