PoseCrafter: Extreme Pose Estimation with Hybrid Video Synthesis
Qing Mao, Tianxin Huang, Yu Zhu, Jinqiu Sun, Yanning Zhang, Gim Hee Lee

TL;DR
PoseCrafter introduces a hybrid video synthesis approach combining interpolation and pose-conditioned view synthesis, along with a feature matching selector, to improve camera pose estimation in challenging scenarios with minimal or no image overlap.
Contribution
The paper proposes a hybrid video generation method and a feature matching selector to improve pose estimation accuracy for image pairs with sparse or no overlap.
Findings
Significantly improves pose estimation on challenging datasets.
Outperforms existing state-of-the-art methods.
Effective in cases with minimal or no image overlap.
Abstract
Pairwise camera pose estimation from sparsely overlapping image pairs remains a critical and unsolved challenge in 3D vision. Most existing methods struggle with image pairs that have small or no overlap. Recent approaches attempt to address this by synthesizing intermediate frames using video interpolation and selecting key frames via a self-consistency score. However, the generated frames are often blurry because the inputs overlap so little, and the selection strategies are slow and not explicitly aligned with pose estimation. To handle these cases, we propose Hybrid Video Generation (HVG), which synthesizes clearer intermediate frames by coupling a video interpolation model with a pose-conditioned novel view synthesis model. We also propose a Feature Matching Selector (FMS), based on feature correspondence, to select intermediate frames appropriate for pose estimation from the synthesized…
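The idea of selecting intermediate frames by feature correspondence can be illustrated with a minimal sketch. This is not the paper's FMS: the abstract is truncated and does not state the scoring rule, so the sketch assumes each frame is already represented by L2-normalised local descriptors and ranks frames by their count of mutual nearest-neighbour matches against a reference image. The function names `mutual_nn_matches` and `select_frames` and the parameter `k` are hypothetical.

```python
import numpy as np


def mutual_nn_matches(desc_a, desc_b):
    """Count mutual nearest-neighbour matches between two descriptor sets.

    Assumption: rows are L2-normalised descriptors, so the dot product
    acts as a cosine similarity. This is a stand-in scoring rule, not
    the scoring used by the paper.
    """
    if len(desc_a) == 0 or len(desc_b) == 0:
        return 0
    sim = desc_a @ desc_b.T                # (Na, Nb) similarity matrix
    nn_ab = sim.argmax(axis=1)             # best match in b for each a
    nn_ba = sim.argmax(axis=0)             # best match in a for each b
    # A match is mutual when a -> b -> a returns to the same index.
    mutual = nn_ba[nn_ab] == np.arange(len(desc_a))
    return int(mutual.sum())


def select_frames(ref_desc, frame_descs, k=2):
    """Return indices of the k frames with the most mutual matches.

    Ties are broken in favour of the earlier frame index.
    """
    scores = [mutual_nn_matches(ref_desc, d) for d in frame_descs]
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k], scores
```

In practice the descriptors would come from a feature extractor run on the reference image and on each synthesized frame. The selected frames would then be passed to the pose estimator together with the input pair.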
Taxonomy
Topics: Advanced Vision and Imaging · Robot Manipulation and Learning · Robotics and Sensor-Based Localization
