Multi-view Convolutional Neural Networks for 3D Shape Recognition

Hang Su; Subhransu Maji; Evangelos Kalogerakis; Erik Learned-Miller

arXiv:1505.00880·cs.CV·September 29, 2015·279 cites

TL;DR

This paper shows that CNN descriptors learned from rendered 2D views recognize 3D shapes more accurately than traditional 3D shape descriptors, with further gains when multiple views are used. It also introduces a novel multi-view CNN architecture.

Contribution

The paper introduces a new multi-view CNN architecture that combines multiple 2D views into a compact shape descriptor, significantly improving 3D shape recognition performance.
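As a rough illustration of how several per-view feature vectors might be merged into one compact shape descriptor, the sketch below takes an element-wise maximum across views. The function name `view_pool`, the toy feature values, and the choice of max as the aggregation are assumptions made for illustration; this page does not spell out the aggregation step, and a real network would pool learned CNN activations rather than hand-written numbers.

```python
from typing import List, Sequence


def view_pool(view_features: Sequence[Sequence[float]]) -> List[float]:
    """Merge per-view feature vectors with an element-wise maximum.

    Assumption: max is used as the aggregation; `view_pool` is a
    hypothetical helper, not an API from the paper or any library.
    """
    if not view_features:
        raise ValueError("need at least one view")
    dim = len(view_features[0])
    if any(len(f) != dim for f in view_features):
        raise ValueError("all view features must have the same length")
    # zip(*...) groups the i-th entry of every view together.
    return [max(column) for column in zip(*view_features)]


# Three hypothetical views, each described by a 4-dimensional feature vector.
features = [
    [0.1, 0.9, 0.0, 0.3],
    [0.4, 0.2, 0.0, 0.8],
    [0.3, 0.5, 0.7, 0.1],
]
descriptor = view_pool(features)
print(descriptor)  # [0.4, 0.9, 0.7, 0.8]
```

The pooled descriptor has the same length as a single view's feature vector regardless of how many views are supplied, which is what makes it compact.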

Findings

01 A CNN applied to a single rendered view recognizes 3D shapes more accurately than state-of-the-art 3D shape descriptors.

02 Recognition accuracy increases further when multiple views of a shape are provided.

03 The proposed architecture improves recognition performance with fewer views.

Abstract

A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors? We address this question in the context of learning to recognize 3D shapes from a collection of their rendered views on 2D images. We first present a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and show that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors. Recognition rates further increase when multiple views of the shapes are provided. In addition, we present a novel CNN architecture that combines information from multiple views of a 3D shape into a single and…
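For the baseline described in the abstract, where each rendered view is classified independently, one simple way to use several views is to average the per-view class scores and pick the highest. The sketch below assumes exactly that; the function `average_view_scores`, the class labels, and the score values are hypothetical, and averaging is an illustrative choice rather than the procedure the authors necessarily used.

```python
from typing import Dict, List


def average_view_scores(per_view_scores: List[Dict[str, float]]) -> str:
    """Average class scores over views and return the top-scoring class.

    Assumption: scores from independent views are combined by a plain
    mean; this is a hypothetical helper for illustration only.
    """
    if not per_view_scores:
        raise ValueError("need at least one view")
    totals: Dict[str, float] = {}
    for scores in per_view_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    n = len(per_view_scores)
    means = {label: total / n for label, total in totals.items()}
    return max(means, key=means.get)


# Hypothetical classifier outputs for three rendered views of one shape.
views = [
    {"chair": 0.6, "table": 0.4},
    {"chair": 0.3, "table": 0.7},  # an ambiguous viewpoint
    {"chair": 0.8, "table": 0.2},
]
predicted = average_view_scores(views)
print(predicted)  # chair
```

In this toy case the second view alone would be misclassified, while the averaged scores recover the correct label, which mirrors the abstract's observation that additional views raise recognition rates.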

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Image Processing and 3D Reconstruction · Industrial Vision Systems and Defect Detection