COG: Confidence-aware Optimal Geometric Correspondence for Unsupervised Single-reference Novel Object Pose Estimation
Yuchen Che, Jingtu Wu, Hao Zheng, Asako Kanezaki

TL;DR
This paper introduces COG, an unsupervised framework for 6DoF object pose estimation that uses confidence-aware optimal transport to find robust correspondences. The unsupervised variant performs comparably to supervised methods, and a supervised variant surpasses existing approaches.
Contribution
The paper presents a novel confidence-aware optimal transport approach for correspondence estimation, integrating semantic priors and enabling unsupervised learning for pose estimation.
Findings
Unsupervised COG achieves performance comparable to supervised methods.
Supervised COG surpasses existing approaches in accuracy.
The method effectively handles occlusions and view-point changes.
Abstract
Estimating the 6DoF pose of a novel object with a single reference view is challenging due to occlusions, view-point changes, and outliers. A core difficulty lies in finding robust cross-view correspondences, as existing methods often rely on discrete one-to-one matching that is non-differentiable and tends to collapse onto sparse key-points. We propose Confidence-aware Optimal Geometric Correspondence (COG), an unsupervised framework that formulates correspondence estimation as a confidence-aware optimal transport problem. COG produces balanced soft correspondences by predicting point-wise confidences and injecting them as optimal transport marginals, suppressing non-overlapping regions. Semantic priors from vision foundation models further regularize the correspondences, leading to stable pose estimation. This design integrates confidence into the correspondence finding and pose…
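To illustrate the idea of injecting point-wise confidences as optimal transport marginals, here is a minimal sketch in Python. It assumes a standard entropic-regularized Sinkhorn solver, which the summary above does not confirm is the solver COG uses. The function name `confidence_aware_sinkhorn`, the cosine-distance cost, the regularization strength `eps`, and the toy confidences are all hypothetical choices made for illustration.

```python
import numpy as np


def confidence_aware_sinkhorn(cost, conf_src, conf_tgt, eps=0.5, n_iters=200):
    """Soft correspondences from entropic OT with confidence-derived marginals.

    Illustrative sketch only (not the authors' implementation).
    cost:     (n, m) pairwise matching cost between source and target points.
    conf_src: (n,) positive confidences for source points.
    conf_tgt: (m,) positive confidences for target points.
    """
    # Normalize confidences so both marginals carry the same total mass (1).
    a = conf_src / conf_src.sum()
    b = conf_tgt / conf_tgt.sum()

    kernel = np.exp(-cost / eps)  # Gibbs kernel, strictly positive
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)      # rescale rows toward marginal a
        v = b / (kernel.T @ u)    # rescale columns toward marginal b

    # Transport plan: entry (i, j) is the soft correspondence weight.
    return u[:, None] * kernel * v[None, :]


# Toy usage with random unit-norm features (assumed, not from the paper).
rng = np.random.default_rng(0)
feat_src = rng.normal(size=(6, 16))
feat_tgt = rng.normal(size=(8, 16))
feat_src /= np.linalg.norm(feat_src, axis=1, keepdims=True)
feat_tgt /= np.linalg.norm(feat_tgt, axis=1, keepdims=True)

cost = 1.0 - feat_src @ feat_tgt.T  # cosine distance, values in [0, 2]
# Low confidence on the last two source points mimics non-overlapping regions.
conf_src = np.array([0.9, 0.8, 0.9, 0.7, 0.05, 0.05])
conf_tgt = np.full(8, 0.5)

plan = confidence_aware_sinkhorn(cost, conf_src, conf_tgt)
```

Because each row of `plan` is constrained to sum to that point's normalized confidence, low-confidence points can contribute only a small share of the total correspondence mass. This is one way to read the abstract's claim that confidence marginals suppress non-overlapping regions while keeping the matching soft and differentiable.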
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Human Pose and Action Recognition
