Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction
Yuhao Yang, Jun Wu, Yue Wang, Guangjian Zhang, Rong Xiong

TL;DR
This paper introduces a bidirectional prediction network with point-wise attention for 6D object pose estimation, improving robustness especially under occlusion by modeling geometric similarities and correlations.
Contribution
It proposes a novel bidirectional correspondence prediction network with a point-wise attention mechanism and a pseudo-siamese network to enhance feature homogeneity and pose estimation accuracy.
Findings
Outperforms state-of-the-art methods on LineMOD, YCB-Video, and Occ-LineMOD datasets.
Shows significant robustness improvements under severe occlusion conditions.
Reports higher 6D pose estimation accuracy than the compared methods.
Abstract
Traditional geometric-registration-based estimation methods exploit the CAD model only implicitly, which makes them dependent on observation quality and weak under occlusion. To address this problem, we propose a bidirectional correspondence prediction network with a point-wise attention-aware mechanism. This network not only requires the model points to predict the correspondence but also explicitly models the geometric similarities between observations and the model prior. Our key insight is that the correlations between each model point and scene point provide essential information for learning point-pair matches. To further tackle the correlation noise caused by feature distribution divergence, we design a simple but effective pseudo-siamese network to improve feature homogeneity. Experimental results on the public datasets of LineMOD, YCB-Video, and Occ-LineMOD show…
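The idea of correlating every model point with every scene point, and predicting correspondences in both directions, can be illustrated with a minimal NumPy sketch. This is not the authors' network: the scaled dot-product correlation, the softmax weighting, and all names (`bidirectional_correspondence`, `model_feat`, `scene_feat`, and so on) are assumptions chosen for illustration, and the learned feature extractors (including the pseudo-siamese branch) are replaced here by precomputed feature arrays.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def bidirectional_correspondence(model_feat, scene_feat, model_xyz, scene_xyz):
    """Illustrative point-wise attention between model and scene points.

    model_feat: (M, D) per-point features of the CAD model (assumed given)
    scene_feat: (N, D) per-point features of the observed scene (assumed given)
    model_xyz:  (M, 3) model point coordinates
    scene_xyz:  (N, 3) scene point coordinates
    """
    d = model_feat.shape[1]
    # Correlation of every model point with every scene point, shape (M, N).
    # Scaled dot product is an assumption, not taken from the paper.
    corr = model_feat @ scene_feat.T / np.sqrt(d)

    # Model -> scene: each model point attends over all scene points.
    attn_m2s = softmax(corr, axis=1)
    model_to_scene = attn_m2s @ scene_xyz      # (M, 3) soft matches in the scene

    # Scene -> model: each scene point attends over all model points.
    attn_s2m = softmax(corr, axis=0)
    scene_to_model = attn_s2m.T @ model_xyz    # (N, 3) soft matches on the model

    return corr, model_to_scene, scene_to_model
```

Each direction yields a set of soft point-pair matches; a registration step (not shown) could then fit a rigid transform to them. When the two feature sets follow different distributions, the correlation matrix becomes noisy, which is the problem the abstract says the pseudo-siamese network is meant to reduce.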
Taxonomy
Topics: Robotics and Sensor-Based Localization · Human Pose and Action Recognition · Advanced Neural Network Applications
