Emergent Mind

Abstract

3D object recognition has attracted wide research attention in the field of multimedia and computer vision. With the recent proliferation of deep learning, various deep models with different representations have achieved the state-of-the-art performance. Among them, point cloud and multi-view based 3D shape representations are promising recently, and their corresponding deep models have shown significant performance on 3D shape recognition. However, there is little effort concentrating point cloud data and multi-view data for 3D shape representation, which is, in our consideration, beneficial and compensated to each other. In this paper, we propose the Point-View Network (PVNet), the first framework integrating both the point cloud and the multi-view data towards joint 3D shape recognition. More specifically, an embedding attention fusion scheme is proposed that could employ high-level features from the multi-view data to model the intrinsic correlation and discriminability of different structure features from the point cloud data. In particular, the discriminative descriptions are quantified and leveraged as the soft attention mask to further refine the structure feature of the 3D shape. We have evaluated the proposed method on the ModelNet40 dataset for 3D shape classification and retrieval tasks. Experimental results and comparisons with state-of-the-art methods demonstrate that our framework can achieve superior performance.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.