RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints

Asako Kanezaki; Yasuyuki Matsushita; Yoshifumi Nishida

arXiv:1603.06208 · cs.CV · March 26, 2018 · 50 citations

TL;DR

RotationNet is a CNN that jointly estimates an object's category and pose from multi-view images. It treats viewpoint labels as latent variables learned without supervision, works with only a partial set of views at inference time, and reports better results than prior state-of-the-art methods.

Contribution

Introduces a CNN that learns viewpoint labels without supervision, enabling joint object categorization and pose estimation from multi-view images of unaligned objects.

Findings

01 Outperforms state-of-the-art 3D classification methods on the ModelNet datasets.

02 Achieves state-of-the-art pose estimation accuracy without using known viewpoint labels for training.

03 Remains effective when only a partial set of multi-view images is available, as is common in practical scenarios.

Abstract

We propose a Convolutional Neural Network (CNN)-based model "RotationNet," which takes multi-view images of an object as input and jointly estimates its pose and object category. Unlike previous approaches that use known viewpoint labels for training, our method treats the viewpoint labels as latent variables, which are learned in an unsupervised manner during the training using an unaligned object dataset. RotationNet is designed to use only a partial set of multi-view images for inference, and this property makes it useful in practical scenarios where only partial views are available. Moreover, our pose alignment strategy enables one to obtain view-specific feature representations shared across classes, which is important to maintain high accuracy in both object categorization and pose estimation. Effectiveness of RotationNet is demonstrated by its superior performance to the…
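The joint estimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the score layout `log_probs[i, v, c]`, the function name `joint_category_and_pose`, the `offsets` parameter, and the assumption that candidate poses are cyclic shifts of M viewpoints on a circle are all hypothetical choices made for illustration. The sketch sums per-image scores under each candidate pose and picks the category and pose that score highest together, using however many views are available.

```python
import numpy as np


def joint_category_and_pose(log_probs, offsets=None):
    """Pick the (category, pose) pair with the highest summed score.

    log_probs: array of shape (K, M, C). log_probs[i, v, c] is an assumed
        per-image log-score that image i was taken from viewpoint v and
        shows category c. K may be smaller than M (partial views).
    offsets: assumed relative position of each image along the camera
        circle; defaults to 0, 1, ..., K-1 (consecutive views).
    """
    log_probs = np.asarray(log_probs, dtype=float)
    k, m, c = log_probs.shape
    if offsets is None:
        offsets = np.arange(k)
    offsets = np.asarray(offsets)

    # totals[shift, category] = summed log-score if the first image sits
    # at viewpoint `shift` and the object belongs to `category`.
    totals = np.empty((m, c))
    for shift in range(m):
        views = (offsets + shift) % m  # viewpoint assigned to each image
        totals[shift] = log_probs[np.arange(k), views, :].sum(axis=0)

    # Joint argmax over pose (shift) and category.
    shift, category = np.unravel_index(np.argmax(totals), totals.shape)
    return int(category), int(shift), totals
```

Because the pose is treated as an unknown to be maximized over rather than as an input, the same scoring can be applied to any subset of views; a training procedure in the same spirit would use the best-scoring viewpoint assignment as the latent label, though the details of that step are not given in the text above.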

Code & Models

Repositories

kanezaki/rotationnet (official)

Taxonomy

Topics: Human Pose and Action Recognition · 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis