- The paper introduces a branched autoencoder that achieves unsupervised shape co-segmentation by minimizing shape reconstruction loss.
- It leverages a CNN encoder and a specialized branched decoder to learn compact representations of recurrent shape parts without relying on ground-truth labels.
- The network demonstrates strong performance in unsupervised, weakly supervised, and one-shot learning settings compared to standard segmentation models.
Analysis of Bae-Net: A Branched Autoencoder for Shape Co-Segmentation
The paper presents Bae-Net, a novel branched autoencoder network designed for shape co-segmentation tasks. Bae-Net frames co-segmentation as a representation learning problem and trains by minimizing a shape reconstruction loss in an unsupervised manner. Unlike many existing segmentation frameworks, Bae-Net does not rely on ground-truth labels for training, which lets it operate in unsupervised, weakly supervised, and one-shot learning settings.
Core Methodology
Bae-Net integrates a branched decoder within an autoencoder framework: a convolutional neural network (CNN) encoder extracts a feature code from the input shape, and the decoder combines that code with point coordinates to predict whether each query point lies inside or outside the target shape, reconstructing the shape as an implicit field. The critical innovation is the branched decoder itself: each branch learns a compact representation of a recurrent shape part across the dataset, so a segmentation emerges from which branch claims each point. This part-based learning lets Bae-Net capture shape structure in a way that aligns with human perception theories of object recognition.
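The branched decoder can be sketched minimally as follows. The code dimension, hidden width, branch count, and the two-layer MLP per branch are illustrative assumptions rather than the paper's exact configuration (the weights here are random stand-ins for trained parameters); the sketch shows the key mechanism: each branch emits an occupancy value for a query point, the reconstruction is the per-point maximum over branches, and the arg-max branch index serves as the part label.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative sizes: 128-d shape code, 3-d points, 4 decoder branches.
CODE_DIM, POINT_DIM, HIDDEN, N_BRANCHES = 128, 3, 64, 4

# One small MLP per branch; random weights stand in for trained parameters.
branches = [
    (rng.standard_normal((CODE_DIM + POINT_DIM, HIDDEN)) * 0.1,
     rng.standard_normal((HIDDEN, 1)) * 0.1)
    for _ in range(N_BRANCHES)
]

def branched_decoder(code, points):
    """Each branch predicts an occupancy in [0, 1] for every query point;
    the shape reconstruction is the max over branches, and the arg-max
    branch index gives an emergent part label per point."""
    x = np.concatenate([np.tile(code, (len(points), 1)), points], axis=1)
    per_branch = np.stack(
        [sigmoid(np.maximum(x @ W1, 0.0) @ W2)[:, 0] for W1, W2 in branches],
        axis=1)                              # (n_points, n_branches)
    occupancy = per_branch.max(axis=1)       # reconstructed implicit field
    part_label = per_branch.argmax(axis=1)   # which branch "owns" each point
    return occupancy, part_label

code = rng.standard_normal(CODE_DIM)         # stand-in for the CNN encoding
points = rng.uniform(-1.0, 1.0, size=(5, POINT_DIM))
occ, labels = branched_decoder(code, points)
print(occ.shape, labels.shape)
```

Because the max is taken over branches, no branch is forced to cover the whole shape; during training each branch tends to specialize on one recurrent part, which is what makes the co-segmentation fall out of the reconstruction objective.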
Performance and Evaluation
Empirical evaluation of Bae-Net spans several learning paradigms:
- Unsupervised Learning: The network achieves shape co-segmentation using purely shape reconstruction loss. Bae-Net successfully segments shapes into distinct, consistent parts across large datasets, demonstrating competitive performance in comparison to traditional supervised models, despite utilizing no annotated training data.
- Weakly Supervised Learning: Given only weak cues such as shape-level binary tags indicating whether a part is present, Bae-Net refines its part segmentation. The method achieves higher Area Under the Curve (AUC) scores than state-of-the-art weakly supervised techniques such as Tags2Parts, demonstrating that it can localize parts with minimal supervision.
- One-shot Learning: Bae-Net's architecture supports efficient one-shot learning, in which a handful of annotated exemplars guides the segmentation of an entire collection. The network outperforms several prominent supervised methods (e.g., PointNet, PointNet++) that require substantially more annotated data, making it a compelling alternative when annotations are scarce.
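The unsupervised objective driving the first setting above can be illustrated with a point-wise reconstruction loss. A mean-squared error between predicted occupancy and binary inside/outside labels sampled from the training shape is assumed here as a minimal sketch; the function name and sample values are illustrative.

```python
import numpy as np

def reconstruction_loss(pred_occ, gt_inside):
    """Mean squared error between predicted occupancy values in [0, 1]
    and binary inside/outside labels at sampled points."""
    return float(np.mean((pred_occ - gt_inside) ** 2))

# Illustrative values: predictions for four sampled points vs. labels.
pred = np.array([0.9, 0.1, 0.8, 0.2])
gt   = np.array([1.0, 0.0, 1.0, 1.0])
loss = reconstruction_loss(pred, gt)
print(round(loss, 4))  # -> 0.175
```

No part annotations appear anywhere in this objective; the branch structure of the decoder, not the loss, is what induces the segmentation.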
Implications and Future Directions
Bae-Net's ability to learn compact, distributed part representations not only enables effective shape segmentation but also suggests that networks for similar tasks can be trained with far less labeled data. Its adaptability to varying degrees of supervision makes it a flexible tool within the broader context of geometric deep learning and unsupervised representation learning.
Future developments may extend Bae-Net's capabilities to finer-grained segmentation by deepening its architecture and handling high-resolution input more efficiently. Incorporating semantic awareness could address the challenge of consistent part labeling across dissimilar categories, while hierarchical part segmentation and rotation-invariant training could further broaden Bae-Net's applicability to the diverse geometries encountered in real-world applications.
As the field of shape analysis progresses, Bae-Net's branched architecture offers a practical response to data scarcity and a meaningful step toward efficient representation learning for 3D shapes.