ViewBridge: Revisiting Cross-View Localization from Image Matching
Panwang Xia, Qiong Wu, Lei Yu, Yi Liu, Mingtao Xiong, Xudong Lu, Yi Liu, Haoyu Guo, Yongxiang Yao, Junjian Zhang, Xiangyuan Cai, Hongwei Hu, Zhi Zheng, Yongjun Zhang, Yi Wan

TL;DR
This paper introduces a new framework for cross-view localization that improves fine-grained image matching and pose estimation by incorporating geometric constraints and adaptive similarity refinement, supported by a large annotated benchmark.
Contribution
The paper proposes a Surface Model and a SimRefiner to improve matching accuracy and geometric consistency in cross-view localization, and introduces the first large-scale benchmark with pixel-level annotations.
Findings
Achieves geometry-consistent, fine-grained correspondences across extreme viewpoints.
Improves localization accuracy and stability.
Provides a new benchmark with extensive annotations.
Abstract
Cross-view localization aims to estimate the 3-DoF pose of a ground-view image by aligning it with aerial or satellite imagery. Existing methods typically address this task through direct regression or feature alignment in a shared bird's-eye view (BEV) space. Although effective for coarse alignment, these methods fail to establish fine-grained and geometrically reliable correspondences under large viewpoint variations, thereby limiting both the accuracy and interpretability of localization results. Consequently, we revisit cross-view localization from the perspective of image matching and propose a unified framework that enhances both matching and localization. Specifically, we introduce a Surface Model that constrains BEV feature projection to physically valid regions for geometric consistency, and a SimRefiner that adaptively refines similarity distributions to enhance match…
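To make the "3-DoF pose" in the abstract concrete: once a ground-view image has been matched to the aerial image, the pose (a planar rotation plus a 2D translation) can in principle be recovered from the matched point pairs. The sketch below is a generic closed-form least-squares rigid fit in 2D, not the method described in the paper. The function name `estimate_pose_2d` and the assumption that ground-view matches have already been projected into metric BEV coordinates are illustrative choices made here.

```python
import math


def estimate_pose_2d(ground_pts, aerial_pts):
    """Least-squares 2D rigid fit so that aerial ~= R(theta) * ground + t.

    Hypothetical helper for illustration only. Both inputs are equal-length
    sequences of (x, y) pairs, assumed to be matched point coordinates that
    live in the same metric scale.
    Returns (theta, tx, ty) with theta in radians.
    """
    if len(ground_pts) != len(aerial_pts) or len(ground_pts) < 2:
        raise ValueError("need at least two matched point pairs")

    n = len(ground_pts)
    # Centroids of both point sets.
    gx = sum(p[0] for p in ground_pts) / n
    gy = sum(p[1] for p in ground_pts) / n
    ax = sum(p[0] for p in aerial_pts) / n
    ay = sum(p[1] for p in aerial_pts) / n

    # Accumulate dot and cross terms of the centered pairs; their ratio
    # gives the optimal rotation angle in closed form.
    s_cos = 0.0
    s_sin = 0.0
    for (x, y), (u, v) in zip(ground_pts, aerial_pts):
        x, y, u, v = x - gx, y - gy, u - ax, v - ay
        s_cos += x * u + y * v
        s_sin += x * v - y * u
    theta = math.atan2(s_sin, s_cos)

    # The translation maps the rotated ground centroid onto the aerial centroid.
    c, s = math.cos(theta), math.sin(theta)
    tx = ax - (c * gx - s * gy)
    ty = ay - (s * gx + c * gy)
    return theta, tx, ty


if __name__ == "__main__":
    # Synthetic check: rotate and shift a few points, then recover the pose.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
    c, s = math.cos(0.3), math.sin(0.3)
    moved = [(c * x - s * y + 2.0, s * x + c * y - 1.0) for x, y in pts]
    print(estimate_pose_2d(pts, moved))
```

A plain least-squares fit like this is sensitive to wrong matches, so in practice it would usually be wrapped in a robust estimator such as RANSAC. That sensitivity is one reason geometrically reliable correspondences matter for this task.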
