Introducing Pose Consistency and Warp-Alignment for Self-Supervised 6D Object Pose Estimation in Color Images

Juil Sock; Guillermo Garcia-Hernando; Anil Armagan; Tae-Kyun Kim

arXiv:2003.12344 · cs.CV · October 19, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a self-supervised framework for 6D object pose estimation that enhances generalization from synthetic data without requiring real-world pose annotations, using pose and photometric consistency techniques.

Contribution

It proposes a two-stage self-supervised approach that improves 6D pose estimation by enforcing pose consistency and then photometric consistency. The approach can be applied on top of existing neural-network-based estimators and needs no pose annotations on real images.

Findings

01

Achieves state-of-the-art results on LINEMOD, LINEMOD OCCLUSION, and HomebrewedDB datasets.

02

Outperforms methods trained only on synthetic data and domain adaptation baselines.

03

Effective in real-world scenarios without requiring pose annotations or depth information.

Abstract

Most successful approaches to estimate the 6D pose of an object typically train a neural network by supervising the learning with annotated poses in real world images. These annotations are generally expensive to obtain and a common workaround is to generate and train on synthetic scenes, with the drawback of limited generalisation when the model is deployed in the real world. In this work, a two-stage 6D object pose estimator framework that can be applied on top of existing neural-network-based approaches and that does not require pose annotations on real images is proposed. The first self-supervised stage enforces the pose consistency between rendered predictions and real input images, narrowing the gap between the two domains. The second stage fine-tunes the previously trained model by enforcing the photometric consistency between pairs of different object views, where one image is…
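
As a rough illustration of the two consistency signals the abstract names, the sketch below computes (a) an angular distance between two rotation estimates and (b) a masked mean absolute difference between a target image and a warped image. This is a minimal sketch under stated assumptions, not the paper's loss functions: the function names `rotation_angle` and `masked_photometric_error`, the use of the geodesic angle as the pose discrepancy, and the L1 photometric error on grayscale images are all hypothetical choices made for illustration.

```python
import math


def rotation_angle(R_a, R_b):
    """Geodesic angle in radians between two 3x3 rotation matrices.

    Hypothetical pose-discrepancy measure (an assumption, not taken from the paper).
    """
    # trace(R_a^T R_b) equals the sum of element-wise products of the two matrices.
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_theta)


def masked_photometric_error(target, warped, mask):
    """Mean absolute intensity difference over pixels where mask is truthy.

    Images are nested lists of grayscale values with identical shapes.
    Hypothetical photometric measure (an assumption, not taken from the paper).
    """
    total, count = 0.0, 0
    for t_row, w_row, m_row in zip(target, warped, mask):
        for t, w, m in zip(t_row, w_row, m_row):
            if m:
                total += abs(t - w)
                count += 1
    # An empty mask contributes no error.
    return total / count if count else 0.0


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
    print(rotation_angle(identity, quarter_turn_z))
    print(masked_photometric_error([[1, 2], [3, 4]], [[1, 0], [0, 4]], [[1, 1], [0, 1]]))
```

In a training setup, quantities like these would be computed on differentiable tensors inside a deep-learning framework; plain Python lists are used here only to keep the example self-contained.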

Taxonomy

Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Robotics and Sensor-Based Localization