Robust Drone-View Geo-Localization via Content-Viewpoint Disentanglement
Ke Li, Di Wang, Xiaowei Wang, Zhihong Wu, Yiming Zhang, Yifeng Wang, Quan Wang

TL;DR
This paper introduces CVD, a novel framework for drone-view geo-localization that disentangles content and viewpoint features to improve robustness and accuracy across diverse scenarios and viewpoints.
Contribution
The paper proposes a content-viewpoint disentanglement framework with mutual information and reconstruction constraints, enhancing cross-view localization performance.
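As a rough illustration of how an objective of this kind could be assembled, the sketch below splits an embedding into a content part and a viewpoint part, then combines a reconstruction term with an independence penalty. This is not the paper's formulation: the function names, the linear decoder, the fixed split point, and the cross-covariance penalty (a crude linear stand-in for a mutual information constraint) are all assumptions made for illustration.

```python
import numpy as np

def split_embedding(z, content_dim):
    """Split a batch of embeddings (N, D) into content and viewpoint parts.
    The fixed split point is a hypothetical design choice."""
    return z[:, :content_dim], z[:, content_dim:]

def cross_covariance_penalty(content, viewpoint):
    """Squared Frobenius norm of the cross-covariance between the two parts.
    Assumption: used here as a simple linear proxy for mutual information.
    It is zero when the two parts are linearly uncorrelated."""
    c = content - content.mean(axis=0, keepdims=True)
    v = viewpoint - viewpoint.mean(axis=0, keepdims=True)
    cov = c.T @ v / (content.shape[0] - 1)
    return float(np.sum(cov ** 2))

def reconstruction_loss(content, viewpoint, target, weights):
    """Mean squared error between the target and a linear decoding of the
    concatenated parts. The linear decoder is a hypothetical placeholder."""
    recon = np.concatenate([content, viewpoint], axis=1) @ weights
    return float(np.mean((recon - target) ** 2))

def total_loss(z, target, weights, content_dim, lam=1.0):
    """Reconstruction term plus a weighted independence penalty."""
    content, viewpoint = split_embedding(z, content_dim)
    rec = reconstruction_loss(content, viewpoint, target, weights)
    ind = cross_covariance_penalty(content, viewpoint)
    return rec + lam * ind
```

In this toy setup the reconstruction term keeps the two parts jointly informative about the input, while the penalty discourages them from encoding the same information. A real system would use learned encoders and decoders and a proper mutual information estimator.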
Findings
CVD improves localization accuracy across multiple datasets.
The disentanglement approach enhances robustness to viewpoint variations.
CVD integrates seamlessly into existing pipelines, reducing inference latency.
Abstract
Drone-view geo-localization (DVGL) aims to match images of the same geographic location captured from drone and satellite perspectives. Despite recent advances, DVGL remains challenging due to significant appearance changes and spatial distortions caused by viewpoint variations. Existing methods typically assume that drone and satellite images can be directly aligned in a shared feature space via contrastive learning. Nonetheless, this assumption overlooks the inherent conflicts induced by viewpoint discrepancies, resulting in extracted features containing inconsistent information that hinders precise localization. In this study, we take a manifold learning perspective and model . Building upon this insight, we propose CVD, a new DVGL framework that explicitly…
Taxonomy
Topics: UAV Applications and Optimization · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
