Estimating 6D Pose From Localizing Designated Surface Keypoints
Zelin Zhao, Gao Peng, Haoyu Wang, Hao-Shu Fang, Chengkun Li, Cewu Lu

TL;DR
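This paper introduces a surface keypoint-based method for 6D pose estimation from a single RGB image. It reaches competitive accuracy without any post-processing refinement, handles heavy occlusion by using only the most confident keypoints, and reports a 30% relative improvement in ADD accuracy among refinement-free methods.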
Contribution
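It proposes designating a set of points on the object's surface as keypoints, training a keypoint detector (KPD) to localize them, and recovering the 6D pose with a PnP algorithm. The resulting accuracy is competitive with recent CNN-based methods that depend on refinement, while requiring none.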
Findings
Achieves a 30% relative improvement in ADD accuracy among methods that do not use refinement
Handles heavy occlusion effectively by selecting confident keypoints
Provides competitive accuracy with simpler, faster processing
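For readers unfamiliar with the ADD metric cited above, the following is a minimal sketch of how it is commonly computed: the mean distance between the model points transformed by the ground-truth pose and by the predicted pose, with a pose counted as correct when that distance is below a fraction of the model diameter. The 10% threshold and the function names `add_error`, `model_diameter`, and `add_correct` are assumptions for illustration; this summary does not state the paper's exact evaluation protocol.

```python
import numpy as np


def add_error(model_points, R_gt, t_gt, R_pred, t_pred):
    """Mean distance between model points under the two poses (ADD)."""
    pts_gt = model_points @ R_gt.T + t_gt        # points under ground-truth pose
    pts_pred = model_points @ R_pred.T + t_pred  # points under predicted pose
    return float(np.linalg.norm(pts_gt - pts_pred, axis=1).mean())


def model_diameter(model_points):
    """Largest pairwise distance between model points (brute force)."""
    diffs = model_points[:, None, :] - model_points[None, :, :]
    return float(np.linalg.norm(diffs, axis=2).max())


def add_correct(model_points, R_gt, t_gt, R_pred, t_pred, threshold=0.1):
    """True if the ADD error is below threshold * diameter (assumed 10%)."""
    err = add_error(model_points, R_gt, t_gt, R_pred, t_pred)
    return err < threshold * model_diameter(model_points)
```

ADD accuracy over a test set is then the fraction of frames for which `add_correct` returns true.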
Abstract
In this paper, we present an accurate yet effective solution for 6D pose estimation from an RGB image. The core of our approach is that we first designate a set of surface points on the target object model as keypoints and then train a keypoint detector (KPD) to localize them. Finally, a PnP algorithm recovers the 6D pose from the 2D-3D correspondences of the keypoints. Unlike recent state-of-the-art CNN-based approaches that rely on a time-consuming post-processing procedure, our method achieves competitive accuracy without any refinement after pose prediction. Meanwhile, we obtain a 30% relative improvement in terms of ADD accuracy among methods that do not use refinement. Moreover, we succeed in handling heavy occlusion by selecting the most confident keypoints to recover the 6D pose. For the sake of reproducibility, we will make our code and models publicly available soon.
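The last two steps of the pipeline described in the abstract (keep the most confident keypoints, then recover the pose from 2D-3D correspondences) can be sketched as follows. This is not the authors' implementation: it swaps in a plain linear DLT solver written in NumPy as a stand-in for whichever PnP algorithm the paper uses, it assumes the keypoint detector already outputs a 2D location and a confidence score per keypoint, and it needs at least six non-coplanar keypoints. The names `select_confident` and `pnp_dlt` are hypothetical.

```python
import numpy as np


def select_confident(points_3d, points_2d, confidences, k):
    """Keep the k correspondences with the highest detector confidence."""
    idx = np.argsort(confidences)[::-1][:k]
    return points_3d[idx], points_2d[idx]


def pnp_dlt(points_3d, points_2d, K):
    """Recover (R, t) from >= 6 non-coplanar 2D-3D correspondences via DLT.

    points_3d: (N, 3) keypoints in the object frame.
    points_2d: (N, 2) detected pixel locations.
    K:         (3, 3) camera intrinsic matrix.
    """
    n = len(points_3d)
    # Convert pixels to normalized image coordinates using K^-1.
    pix_h = np.hstack([points_2d, np.ones((n, 1))])
    norm = (np.linalg.inv(K) @ pix_h.T).T
    norm = norm / norm[:, 2:3]
    X_h = np.hstack([points_3d, np.ones((n, 1))])

    # Two linear equations per point in the 12 entries of P = [R | t].
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X_h
    A[0::2, 8:12] = -norm[:, [0]] * X_h
    A[1::2, 4:8] = X_h
    A[1::2, 8:12] = -norm[:, [1]] * X_h

    # The solution (up to scale) is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    if np.linalg.det(P[:, :3]) < 0:
        P = -P  # fix the sign so the rotation part has positive determinant

    # Project the 3x3 block onto the nearest rotation and undo the scale.
    U, S, Vt_r = np.linalg.svd(P[:, :3])
    R = U @ Vt_r
    t = P[:, 3] / S.mean()
    return R, t
```

Under occlusion, hidden keypoints tend to receive low confidence and poorly localized 2D positions, so dropping them before the solve keeps those outliers from corrupting the pose. In practice a robust solver (for example, a RANSAC-based PnP) would typically replace the plain DLT step.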
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robot Manipulation and Learning · Human Pose and Action Recognition
