3D Point-to-Keypoint Voting Network for 6D Pose Estimation
Weitong Hua, Jiaxin Guo, Yue Wang, Rong Xiong

TL;DR
This paper introduces a 3D point-to-keypoint voting network that leverages spatial relationships in RGB-D data to improve 6D pose estimation, especially under occlusion and clutter, achieving state-of-the-art accuracy.
Contribution
The paper proposes a framework for robust 6D pose estimation from RGB-D data that exploits the spatial structure of 3D keypoints and uses point-wise dense feature embeddings to vote for those keypoints.
Findings
Achieves 98.7% ADD(-S) accuracy on the LINEMOD dataset.
Attains 52.6% accuracy on the OCCLUSION LINEMOD dataset.
Runs in real time and is reported to outperform existing methods.
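For readers unfamiliar with the ADD(-S) metric named above, here is a minimal NumPy sketch of how it is commonly computed. It is not taken from the paper: the function names are hypothetical, and the 10%-of-diameter threshold is the convention usually used on LINEMOD, assumed here rather than confirmed by this summary. ADD averages the distance between corresponding model points under the ground-truth and predicted poses; ADD-S, used for symmetric objects, averages the distance to the nearest predicted point instead.

```python
import numpy as np


def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding transformed model points."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    return np.linalg.norm(gt - pred, axis=1).mean()


def add_s_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    # Pairwise distances, shape (N, N); fine for small N, use a KD-tree for large models.
    dists = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def is_pose_correct(distance, diameter, ratio=0.1):
    """Assumed convention: a pose counts as correct if the error is below ratio * diameter."""
    return distance < ratio * diameter
```

Under this convention, the reported accuracy would be the fraction of test frames for which `is_pose_correct` holds, using ADD for asymmetric objects and ADD-S for symmetric ones.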
Abstract
Object 6D pose estimation is an important research topic in computer vision because of its wide range of applications and the challenges posed by the complexity and variability of the real world. We argue that fully exploiting the spatial relationships between points helps improve pose estimation performance, especially in scenes with background clutter and partial occlusion. However, this information has usually been ignored in previous work using RGB images or RGB-D data. In this paper, we propose a framework for 6D pose estimation from RGB-D data based on the spatial structure characteristics of 3D keypoints. We adopt point-wise dense feature embedding to vote for 3D keypoints, which makes full use of the structure information of the rigid body. After the direction vectors pointing to the keypoints are predicted by a CNN, we use RANSAC voting to calculate the coordinate…
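The voting step described in the abstract (per-point direction vectors, followed by RANSAC voting) could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the iteration count, the cosine inlier threshold, and the final least-squares refinement are all hypothetical choices. Each hypothesis is the point closest to two randomly sampled 3D lines; a point votes for a hypothesis when its predicted direction agrees with the direction toward that hypothesis.

```python
import numpy as np


def closest_point_between_lines(p1, d1, p2, d2, eps=1e-8):
    """Midpoint of the shortest segment between lines p1 + s*d1 and p2 + t*d2."""
    w = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if denom < eps:  # nearly parallel lines give no stable hypothesis
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((p1 + s * d1) + (p2 + t * d2))


def ransac_vote_keypoint(points, directions, n_iters=200, cos_thresh=0.99, seed=0):
    """Estimate one 3D keypoint from per-point direction predictions (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    best_inliers, best_count = None, -1
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        hyp = closest_point_between_lines(points[i], directions[i],
                                          points[j], directions[j])
        if hyp is None:
            continue
        to_hyp = hyp - points
        norms = np.maximum(np.linalg.norm(to_hyp, axis=1), 1e-8)
        cos = np.einsum("ij,ij->i", to_hyp, directions) / norms
        inliers = cos > cos_thresh
        if inliers.sum() > best_count:
            best_count, best_inliers = inliers.sum(), inliers
    if best_inliers is None:
        raise ValueError("no valid hypothesis found")
    # Refine: least-squares point closest to all inlier lines,
    # solving sum(I - d d^T) x = sum((I - d d^T) p).
    d_in, p_in = directions[best_inliers], points[best_inliers]
    proj = np.eye(3)[None] - d_in[:, :, None] * d_in[:, None, :]
    A = proj.sum(axis=0)
    b = np.einsum("nij,nj->i", proj, p_in)
    return np.linalg.lstsq(A, b, rcond=None)[0]
```

Calling `ransac_vote_keypoint(points, directions)` with an `(N, 3)` array of object points and an `(N, 3)` array of predicted directions returns a single 3D coordinate; in this sketch, the function would be called once per keypoint.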
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robot Manipulation and Learning · 3D Surveying and Cultural Heritage
