TL;DR
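This paper introduces RHO, a model for robust metric cross-view geo-localization that matches ground-level panoramas against OpenStreetMap (OSM). It is supported by the large-scale CV-RHO dataset and two modules: Split-Undistort-Merge (SUM) for panoramic distortion and Position-Orientation Fusion (POF) for localization accuracy.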
Contribution
The work contributes the CV-RHO benchmark dataset, the two-branch Pin-Pan RHO model, and the SUM and POF modules, with a reported performance gain of up to 20%.
Findings
The CV-RHO dataset contains over 2.7 million images covering different weather and lighting conditions, as well as sensor noise.
The authors report a performance gain of up to 20% for the RHO model.
The SUM module addresses panoramic distortion, and the POF mechanism is designed to improve localization accuracy.
Abstract
Metric Cross-View Geo-Localization (MCVGL) aims to estimate the 3-DoF camera pose (position and heading) by matching ground and satellite images. In this work, instead of pinhole and satellite images, we study robust MCVGL using holistic panoramas and OpenStreetMap (OSM). To this end, we establish a large-scale MCVGL benchmark dataset, CV-RHO, with over 2.7M images under different weather and lighting conditions, as well as sensor noise. Furthermore, we propose a model termed RHO with a two-branch Pin-Pan architecture for accurate visual localization. A Split-Undistort-Merge (SUM) module is introduced to address the panoramic distortion, and a Position-Orientation Fusion (POF) mechanism is designed to enhance the localization accuracy. Extensive experiments prove the value of our CV-RHO dataset and the effectiveness of the RHO model, with a significant performance gain up to 20%…
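To make the idea of splitting a panorama into undistorted views concrete, here is a minimal sketch of the standard geometry for resampling a panorama into virtual pinhole views. It is not the authors' SUM module. It assumes the panorama is stored in equirectangular format, which the abstract does not state, and every function and parameter name below is hypothetical.

```python
import math


def split_yaws(num_views):
    """Evenly spaced heading angles (degrees) for the virtual pinhole views."""
    return [i * 360.0 / num_views for i in range(num_views)]


def pinhole_pixel_to_pano(u, v, view_w, view_h, fov_deg, yaw_deg, pano_w, pano_h):
    """Map pixel (u, v) of a virtual pinhole view to equirectangular coordinates.

    Assumption: the panorama is equirectangular, with longitude 0 at the
    horizontal centre of the image and latitude 0 at the vertical centre.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = (view_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

    # Viewing ray in the camera frame: x right, y down, z forward.
    x = u - (view_w - 1) / 2.0
    y = v - (view_h - 1) / 2.0
    z = f

    # Rotate the ray about the vertical axis by the view's heading.
    yaw = math.radians(yaw_deg)
    xr = x * math.cos(yaw) + z * math.sin(yaw)
    zr = -x * math.sin(yaw) + z * math.cos(yaw)

    # Convert the ray to spherical angles.
    lon = math.atan2(xr, zr)                 # in [-pi, pi]
    lat = math.atan2(y, math.hypot(xr, zr))  # in [-pi/2, pi/2], positive is down

    # Convert the angles to panorama pixel coordinates.
    px = (lon / (2.0 * math.pi) + 0.5) * pano_w
    py = (lat / math.pi + 0.5) * pano_h
    return px % pano_w, py
```

Evaluating `pinhole_pixel_to_pano` for every pixel of each view returned by `split_yaws`, then sampling the panorama at the resulting coordinates, yields a set of perspective views without equirectangular stretching. How RHO merges such views afterwards is not described in the abstract and is not sketched here.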
