Real-Time Object Pose Estimation with Pose Interpreter Networks

Jimmy Wu; Bolei Zhou; Rebecca Russell; Vincent Kee; Syler Wagner; Mitchell Hebert; Antonio Torralba; David M.S. Johnson

arXiv:1808.01099 · cs.RO · January 18, 2019

TL;DR

This paper presents a real-time 6-DoF object pose estimation system using pose interpreter networks trained solely on synthetic data, leveraging object masks to bridge the gap between synthetic and real images.

Contribution

Pose interpreter networks, trained entirely on synthetic data and using object masks as an intermediate representation, enable real-time 6-DoF pose estimation without real pose annotations.

Findings

1. Achieves 20 Hz real-time performance on live RGB data
2. Successfully generalizes from synthetic to real data using object masks
3. Does not require depth information or ICP refinement

Abstract

In this work, we introduce pose interpreter networks for 6-DoF object pose estimation. In contrast to other CNN-based approaches to pose estimation that require expensively annotated object pose data, our pose interpreter network is trained entirely on synthetic pose data. We use object masks as an intermediate representation to bridge real and synthetic. We show that when combined with a segmentation model trained on RGB images, our synthetically trained pose interpreter network is able to generalize to real data. Our end-to-end system for object pose estimation runs in real-time (20 Hz) on live RGB data, without using depth information or ICP refinement.
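
The two-stage structure described in the abstract (RGB image, then object mask, then 6-DoF pose) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `estimate_pose`, `toy_segment`, and `toy_interpret` are hypothetical, the two stages are trivial stand-ins for the trained networks, and representing the pose as a position plus a unit quaternion is an assumption made here for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]   # H x W grid of RGB pixels
Mask = List[List[int]]      # H x W binary object mask

# 20 Hz (the rate quoted in the abstract) leaves 1/20 s = 50 ms per frame.
FRAME_BUDGET_S = 1.0 / 20.0


@dataclass
class Pose:
    position: Tuple[float, ...]     # (x, y, z); assumed representation
    orientation: Tuple[float, ...]  # unit quaternion (w, x, y, z); assumed


def normalize_quaternion(q: Tuple[float, ...]) -> Tuple[float, ...]:
    """Scale a quaternion to unit length so it encodes a valid rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero quaternion")
    return tuple(c / norm for c in q)


def estimate_pose(
    rgb: Image,
    segment: Callable[[Image], Mask],
    interpret: Callable[[Mask], Tuple[Tuple[float, ...], Tuple[float, ...]]],
) -> Pose:
    """Run the two stages in sequence: RGB -> mask -> pose."""
    mask = segment(rgb)                   # stage 1: would be trained on real RGB
    position, raw_quat = interpret(mask)  # stage 2: would be trained on synthetic masks
    return Pose(tuple(position), normalize_quaternion(raw_quat))


def toy_segment(rgb: Image) -> Mask:
    """Hypothetical stand-in for a segmentation model: brightness threshold."""
    return [[1 if sum(px) > 384 else 0 for px in row] for row in rgb]


def toy_interpret(mask: Mask):
    """Hypothetical stand-in for a pose interpreter: mask centroid, fixed depth."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return (0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 0.0)
    cy = sum(r for r, _ in coords) / len(coords)
    cx = sum(c for _, c in coords) / len(coords)
    return (cx, cy, 1.0), (2.0, 0.0, 0.0, 0.0)  # deliberately unnormalized


frame = [[(0, 0, 0), (255, 255, 255)],
         [(0, 0, 0), (0, 0, 0)]]
pose = estimate_pose(frame, toy_segment, toy_interpret)
```

The point of the sketch is the interface: because the second stage only ever sees a mask, it can be trained on rendered masks alone, while the first stage absorbs the appearance differences between synthetic and real images.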
