6DoF Object Pose Estimation via Differentiable Proxy Voting Loss
Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

TL;DR
This paper introduces a differentiable proxy voting loss for 6DoF object pose estimation that accounts for pixel-to-keypoint distances, improving accuracy and training efficiency.
Contribution
It proposes a novel differentiable loss function that enhances vector-field based keypoint voting by incorporating pixel-keypoint distances, enabling end-to-end training.
Findings
Significant improvement in pose estimation accuracy on LINEMOD datasets.
Faster training convergence compared to previous methods.
Effective handling of occlusions and textureless objects.
Abstract
Estimating a 6DOF object pose from a single image is very challenging due to occlusions or textureless appearances. Vector-field based keypoint voting has demonstrated its effectiveness and superiority on tackling those issues. However, direct regression of vector-fields neglects that the distances between pixels and keypoints also affect the deviations of hypotheses dramatically. In other words, small errors in direction vectors may generate severely deviated hypotheses when pixels are far away from a keypoint. In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective. To this end, we develop a simple yet effective differentiable proxy voting loss (DPVL) which mimics the hypothesis selection in the voting procedure. By exploiting our voting loss, we are able to train our network in an end-to-end manner. Experiments on…
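The distance effect described in the abstract can be illustrated with a small sketch. It assumes the proxy for a pixel's vote is the perpendicular distance from the keypoint to the line that starts at the pixel and follows the predicted direction. It also assumes a smooth L1 penalty on that distance. The function names, the penalty, and the toy coordinates are illustrative assumptions, not details taken from the abstract.

```python
import math


def keypoint_to_vote_line_distance(pixel, direction, keypoint):
    """Perpendicular distance from `keypoint` to the line through `pixel`
    along `direction` (assumed form of the proxy hypothesis error)."""
    px, py = pixel
    vx, vy = direction
    kx, ky = keypoint
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # Magnitude of the 2D cross product of (keypoint - pixel) with the
    # direction, divided by the direction's length.
    return abs((kx - px) * vy - (ky - py) * vx) / norm


def smooth_l1(x, beta=1.0):
    """Smooth L1 penalty (an assumed choice of robust penalty)."""
    x = abs(x)
    return 0.5 * x * x / beta if x < beta else x - 0.5 * beta


def proxy_voting_loss(pixels, directions, keypoint):
    """Mean smooth L1 penalty of the proxy distances over all pixels."""
    total = sum(
        smooth_l1(keypoint_to_vote_line_distance(p, v, keypoint))
        for p, v in zip(pixels, directions)
    )
    return total / len(pixels)


# Toy setup: keypoint at the origin, one near pixel and one far pixel,
# both with the same 1-degree angular error in the predicted direction.
keypoint = (0.0, 0.0)
theta = math.radians(1.0)
noisy_direction = (-math.cos(theta), -math.sin(theta))  # true direction is (-1, 0)

near = keypoint_to_vote_line_distance((10.0, 0.0), noisy_direction, keypoint)
far = keypoint_to_vote_line_distance((100.0, 0.0), noisy_direction, keypoint)
loss = proxy_voting_loss([(10.0, 0.0), (100.0, 0.0)],
                         [noisy_direction, noisy_direction], keypoint)
```

Under these assumptions the same angular error yields a proxy distance that grows linearly with the pixel's distance from the keypoint, so the far pixel contributes a larger penalty than the near one. A loss on the direction vectors alone would treat the two pixels identically.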
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
