SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Yan Di; Fabian Manhardt; Gu Wang; Xiangyang Ji; Nassir Navab; Federico Tombari

arXiv:2108.08367 · cs.CV · August 20, 2021

TL;DR

SO-Pose introduces a novel approach that leverages self-occlusion reasoning to improve the accuracy of direct 6D pose estimation from a single RGB image, outperforming existing methods.

Contribution

The paper proposes a two-layer object representation and a fusion framework that enhances end-to-end 6D pose estimation accuracy by incorporating self-occlusion information.

Findings

01 Outperforms state-of-the-art methods on challenging datasets.

02 Achieves higher accuracy in cluttered environments.

03 Demonstrates robustness through cross-layer consistency.

Abstract

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (i.e. the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem. While end-to-end methods have recently demonstrated promising results at high efficiency, they are still inferior when compared with elaborate P$n$P/RANSAC-based approaches in terms of pose accuracy. In this work, we address this shortcoming by means of a novel reasoning about self-occlusion, in order to establish a two-layer representation for 3D objects which considerably enhances the accuracy of end-to-end 6D pose estimation. Our framework, named SO-Pose, takes a single RGB image as input and generates 2D-3D correspondences and self-occlusion information, respectively, using a shared encoder and two separate decoders. Both outputs are then fused to directly regress the 6DoF pose…
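To make the terms in the abstract concrete, the following is a minimal sketch of what a 6DoF pose and a 2D-3D correspondence are. It is not the SO-Pose network or its fusion step. The intrinsics `K`, the pose `(R, t)`, the model points, and the helper names `rotation_z`, `transform`, and `project` are all hypothetical values and names chosen for illustration, and a standard pinhole camera model is assumed.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation about the camera z-axis (illustrative; a full pose has 3 rotational DoF)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, p):
    """Map an object-frame point p into the camera frame: R @ p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def project(K, p_cam):
    """Pinhole projection of a camera-frame point to pixel coordinates (u, v)."""
    fx, fy, cx, cy = K
    x, y, z = p_cam
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical camera intrinsics (fx, fy, cx, cy) and object pose.
K = (600.0, 600.0, 320.0, 240.0)
R = rotation_z(math.pi / 2)   # rotation part of the pose
t = [0.0, 0.0, 1.0]           # translation part of the pose (1 m in front of the camera)

# Hypothetical 3D points on the object model, in the object frame (metres).
model_points = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]

# Each correspondence pairs a 2D pixel with the 3D model point it depicts.
correspondences = [(project(K, transform(R, t, p)), p) for p in model_points]

for (u, v), p in correspondences:
    print(f"pixel ({u:.1f}, {v:.1f}) <-> model point {p}")
```

A P$n$P solver goes in the opposite direction: given such pixel-to-point pairs, it recovers `R` and `t`. The abstract describes a direct method that instead regresses the pose from the predicted correspondences and self-occlusion information.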

Taxonomy

Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Advanced Vision and Imaging