Feature Mapping for Learning Fast and Accurate 3D Pose Inference from Synthetic Images
Mahdi Rad, Markus Oberweger, Vincent Lepetit

TL;DR
This paper introduces a feature mapping approach that leverages synthetic images to train a deep network for fast, accurate 3D pose inference from real images, outperforming existing methods on benchmark datasets.
Contribution
The paper presents a general network-based method that maps features extracted from real images to the corresponding features of synthetic images, so that a network trained largely on synthetic data can be used for 3D pose estimation from real images.
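To make the idea concrete, here is a minimal NumPy sketch of the inference path described above: extract features, map real-image features toward synthetic-image features, then predict the pose. Everything in it is an assumption made for illustration and is not the authors' implementation: the single-layer "networks", the sizes `IMG_DIM`, `FEAT_DIM`, and `POSE_DIM`, the function names, the squared-error mapping loss, and the premise that a real image can be paired with a synthetic image of the same pose.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened image, feature vector, 6-DoF pose vector.
IMG_DIM, FEAT_DIM, POSE_DIM = 64, 16, 6


def make_layer(n_in, n_out):
    """Random weights and zero bias standing in for a trained network."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)


def dense(x, layer, relu=True):
    """Apply one fully connected layer, optionally followed by ReLU."""
    w, b = layer
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y


feature_extractor = make_layer(IMG_DIM, FEAT_DIM)  # image -> features
feature_mapper = make_layer(FEAT_DIM, FEAT_DIM)    # real -> synthetic features
pose_predictor = make_layer(FEAT_DIM, POSE_DIM)    # features -> pose


def predict_pose(image, is_real):
    """Predict a pose; only real-image features pass through the mapper."""
    feats = dense(image, feature_extractor)
    if is_real:
        feats = dense(feats, feature_mapper, relu=False)
    return dense(feats, pose_predictor, relu=False)


def mapping_loss(real_image, synthetic_image):
    """Squared distance between mapped real features and synthetic features.

    Assumes both images show the object under the same pose.
    """
    mapped = dense(dense(real_image, feature_extractor), feature_mapper, relu=False)
    target = dense(synthetic_image, feature_extractor)
    return float(np.sum((mapped - target) ** 2))
```

In this sketch the pose predictor only ever sees features in the synthetic domain, which is why it could in principle be trained on rendered images alone; minimizing something like `mapping_loss` over paired images would be one way to fit the mapper.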
Findings
Outperforms the state of the art on the LINEMOD dataset
Achieves faster inference than exemplar-based methods
Demonstrates high accuracy on the NYU hand pose dataset
Abstract
We propose a simple and efficient method for exploiting synthetic images when training a Deep Network to predict a 3D pose from an image. The ability to use synthetic images for training a Deep Network is extremely valuable, as it is easy to create a virtually infinite training set made of such images, while capturing and annotating real images can be very cumbersome. However, synthetic images do not resemble real images exactly, and using them for training can result in suboptimal performance. It was recently shown that for exemplar-based approaches, it is possible to learn a mapping from the exemplar representations of real images to the exemplar representations of synthetic images. In this paper, we show that this approach is more general, and that a network can also be applied after the mapping to infer a 3D pose: At run time, given a real image of the target object, we first…
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Advanced Neural Network Applications
