3D Pose Estimation and 3D Model Retrieval for Objects in the Wild
Alexander Grabner, Peter M. Roth, Vincent Lepetit

TL;DR
This paper introduces a scalable method for 3D pose estimation and 3D model retrieval for objects in natural RGB images. It outperforms the state of the art for pose estimation on Pascal3D+ and retrieves the same 3D models as human annotators for 50% of the Pascal3D+ validation images on average.
Contribution
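It presents a 3D pose estimation approach for object categories, and a CNN-based multi-view metric learning approach that retrieves 3D models by matching descriptors of RGB images against descriptors of depth images rendered from 3D models under the predicted pose.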
Findings
Significantly outperforms the state of the art on Pascal3D+ for 3D pose estimation.
Chooses the same 3D models as human annotators for 50% of the Pascal3D+ validation images on average.
Successfully retrieves detailed 3D models from ShapeNet in real-world scenarios.
Abstract
We propose a scalable, efficient and accurate approach to retrieve 3D models for objects in the wild. Our contribution is twofold. We first present a 3D pose estimation approach for object categories which significantly outperforms the state-of-the-art on Pascal3D+. Second, we use the estimated pose as a prior to retrieve 3D models which accurately represent the geometry of objects in RGB images. For this purpose, we render depth images from 3D models under our predicted pose and match learned image descriptors of RGB images against those of rendered depth images using a CNN-based multi-view metric learning approach. In this way, we are the first to report quantitative results for 3D model retrieval on Pascal3D+, where our method chooses the same models as human annotators for 50% of the validation images on average. In addition, we show that our method, which was trained purely on…
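The retrieval step described in the abstract can be pictured as a nearest-neighbor search in a shared descriptor space. The following is a minimal sketch, not the authors' implementation: it assumes that descriptors have already been computed (by CNNs in the paper; hand-made vectors here), that each candidate 3D model contributes one descriptor from a depth image rendered under the predicted pose, and that Euclidean distance is the matching metric. The names `retrieve_model`, `candidates`, and `query` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_model(rgb_descriptor, depth_descriptors):
    """Return the id of the model whose depth-render descriptor is closest.

    rgb_descriptor: descriptor of the RGB query image (list of floats).
    depth_descriptors: dict mapping a model id to the descriptor of that
        model's depth image rendered under the predicted pose (assumed to
        be precomputed elsewhere).
    """
    if not depth_descriptors:
        raise ValueError("no candidate models")
    return min(
        depth_descriptors,
        key=lambda model_id: euclidean(rgb_descriptor, depth_descriptors[model_id]),
    )


# Toy descriptors standing in for learned ones (illustrative values only).
candidates = {
    "chair_a": [0.9, 0.1, 0.0],
    "chair_b": [0.2, 0.7, 0.1],
    "chair_c": [0.0, 0.1, 0.9],
}
query = [0.25, 0.65, 0.1]
best = retrieve_model(query, candidates)
```

With these toy values the query lies closest to the `chair_b` descriptor, so that model is returned. Because the pose is predicted first, only one rendering per candidate model needs to be compared in this sketch, rather than renderings from many viewpoints.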
