RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints
Asako Kanezaki, Yasuyuki Matsushita, Yoshifumi Nishida

TL;DR
RotationNet is a CNN model that jointly estimates object category and pose from multi-view images. It learns viewpoint labels in an unsupervised manner, works with only a partial set of views, and outperforms state-of-the-art 3D classification methods on ModelNet datasets.
Contribution
It introduces an unsupervised viewpoint learning approach within a CNN for joint object categorization and pose estimation using multi-view images.
Findings
Outperforms state-of-the-art 3D classification methods on ModelNet datasets.
Achieves state-of-the-art pose estimation accuracy without using known viewpoint labels for training.
Effective with partial multi-view inputs in practical scenarios.
Abstract
We propose a Convolutional Neural Network (CNN)-based model "RotationNet," which takes multi-view images of an object as input and jointly estimates its pose and object category. Unlike previous approaches that use known viewpoint labels for training, our method treats the viewpoint labels as latent variables, which are learned in an unsupervised manner during the training using an unaligned object dataset. RotationNet is designed to use only a partial set of multi-view images for inference, and this property makes it useful in practical scenarios where only partial views are available. Moreover, our pose alignment strategy enables one to obtain view-specific feature representations shared across classes, which is important to maintain high accuracy in both object categorization and pose estimation. Effectiveness of RotationNet is demonstrated by its superior performance to the…
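The joint estimate described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a simplified illustration under several assumptions: each observed image yields a hypothetical log-score array `log_p[i, v, c]` (image `i`, candidate viewpoint `v`, category `c`), the images were captured in order around a circle so that candidate poses are cyclic shifts of the viewpoint indices, and the function name `joint_category_and_pose` is made up for this example. The sketch searches over candidate poses and picks the (category, pose) pair with the highest summed score, and it accepts fewer images than viewpoints to mimic the partial-view setting.

```python
import numpy as np

def joint_category_and_pose(log_p):
    """Pick the (category, shift) pair with the highest summed log-score.

    log_p: array of shape (n_images, n_viewpoints, n_classes), where
           log_p[i, v, c] is an assumed log-score that image i shows
           category c from viewpoint v. n_images may be smaller than
           n_viewpoints (partial set of views).
    Returns (category, shift), where image i is assigned viewpoint
    (i + shift) % n_viewpoints.
    """
    n_images, n_viewpoints, n_classes = log_p.shape
    image_idx = np.arange(n_images)
    best_score, best_class, best_shift = -np.inf, None, None
    for shift in range(n_viewpoints):
        # Candidate pose: a cyclic shift of viewpoint indices (an assumption
        # of this sketch, valid for views taken in order around a circle).
        views = (image_idx + shift) % n_viewpoints
        # Sum the per-image scores of the assigned viewpoints, per category.
        class_scores = log_p[image_idx, views, :].sum(axis=0)
        c = int(np.argmax(class_scores))
        if class_scores[c] > best_score:
            best_score, best_class, best_shift = float(class_scores[c]), c, shift
    return best_class, best_shift

# Toy usage: 3 observed images, 4 candidate viewpoints, 5 categories.
toy = np.full((3, 4, 5), np.log(0.01))
for i in range(3):
    toy[i, (i + 1) % 4, 2] = 0.0  # true category 2, true shift 1
result = joint_category_and_pose(toy)
```

Treating the pose as a variable to be searched over, rather than as a given label, is what lets category and pose be decided together in this sketch; how the actual model learns the viewpoint assignment during training is not covered here.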
Taxonomy
Topics: Human Pose and Action Recognition · 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis
