KGpose: Keypoint-Graph Driven End-to-End Multi-Object 6D Pose Estimation via Point-Wise Pose Voting
Andrew Jeong

TL;DR
KGpose is an end-to-end framework that estimates the 6D poses of multiple objects by combining a keypoint-graph representation with point-wise pose voting, with the aim of improving efficiency and accuracy in complex scenes.
Contribution
It introduces a novel keypoint-graph representation and integrates it with point-wise pose voting for multi-object 6D pose estimation.
Findings
Achieves competitive results on the benchmark dataset.
Enables multi-object pose estimation without extra localization steps.
Provides a unified approach for geometric understanding in scenes.
Abstract
This letter presents KGpose, a novel end-to-end framework for 6D pose estimation of multiple objects. Our approach combines a keypoint-based method with learnable pose regression through a "keypoint-graph", which is a graph representation of the keypoints. KGpose first estimates 3D keypoints for each object using an attentional multi-modal feature fusion of RGB and point cloud features. These keypoints are estimated from each point of the point cloud and converted into a graph representation. The network directly regresses 6D pose parameters for each point through a sequence of keypoint-graph embedding and local graph embedding, both designed with graph convolutions, followed by rotation and translation heads. The final pose for each object is selected from the candidates of point-wise predictions. The method achieves competitive results on the benchmark dataset, demonstrating the…
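The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the fully connected graph topology, the single untrained graph-convolution layer, and the use of a per-point confidence score to select the final pose are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def keypoint_candidates(points, offsets):
    """Per-point keypoint estimates: each point adds its predicted offsets.

    points: (N, 3), offsets: (N, K, 3) -> (N, K, 3).
    """
    return points[:, None, :] + offsets

def normalized_adjacency(num_keypoints):
    """Fully connected keypoint graph with self-loops (assumed topology),
    symmetrically normalized as D^-1/2 A D^-1/2."""
    a = np.ones((num_keypoints, num_keypoints))
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_norm, weight):
    """One graph convolution, ReLU(A_norm @ X @ W), applied per point.

    x: (N, K, C_in), a_norm: (K, K), weight: (C_in, C_out) -> (N, K, C_out).
    """
    return np.maximum(a_norm @ x @ weight, 0.0)

def select_pose(rotations, translations, scores, labels, object_id):
    """Pick the highest-scoring point-wise pose candidate for one object.

    The confidence-based rule is a hypothetical selection criterion.
    rotations: (N, 3, 3), translations: (N, 3), scores: (N,), labels: (N,).
    """
    idx = np.flatnonzero(labels == object_id)
    best = idx[np.argmax(scores[idx])]
    return rotations[best], translations[best]
```

In this sketch, every point carries its own keypoint graph of shape (K, 3), so the graph convolution is broadcast over all N points at once, and the per-object pose is read off from a single winning point rather than averaged over candidates.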
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Mechanisms and Dynamics · Human Motion and Animation
