Robust Drone-View Geo-Localization via Content-Viewpoint Disentanglement

Ke Li; Di Wang; Xiaowei Wang; Zhihong Wu; Yiming Zhang; Yifeng Wang; Quan Wang

arXiv:2505.11822 · cs.CV · November 18, 2025

TL;DR

This paper introduces CVD, a novel framework for drone-view geo-localization that disentangles content and viewpoint features to improve robustness and accuracy across diverse scenarios and viewpoints.

Contribution

The paper proposes a content-viewpoint disentanglement framework with mutual information and reconstruction constraints, enhancing cross-view localization performance.

Findings

01

CVD improves localization accuracy across multiple datasets.

02

The disentanglement approach enhances robustness to viewpoint variations.

03

CVD integrates seamlessly into existing pipelines, reducing inference latency.

Abstract

Drone-view geo-localization (DVGL) aims to match images of the same geographic location captured from drone and satellite perspectives. Despite recent advances, DVGL remains challenging due to significant appearance changes and spatial distortions caused by viewpoint variations. Existing methods typically assume that drone and satellite images can be directly aligned in a shared feature space via contrastive learning. Nonetheless, this assumption overlooks the inherent conflicts induced by viewpoint discrepancies, resulting in extracted features containing inconsistent information that hinders precise localization. In this study, we take a manifold learning perspective and model the feature space of cross-view images as a composite manifold jointly governed by content and viewpoint. Building upon this insight, we propose CVD, a new DVGL framework that explicitly…
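
The contrastive alignment that the abstract attributes to existing methods can be illustrated with a symmetric InfoNCE loss over paired drone and satellite embeddings. This is a minimal NumPy sketch of that generic baseline idea, not the CVD method itself. The function names, the temperature of 0.07, and the random toy features are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_info_nce(drone_feats, sat_feats, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of drone_feats and row i of sat_feats are assumed to show the
    same location; every other row in the batch serves as a negative.
    The temperature value is an illustrative assumption.
    """
    d = l2_normalize(drone_feats)
    s = l2_normalize(sat_feats)
    logits = d @ s.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(d))                 # matching pairs lie on the diagonal
    loss_d2s = -log_softmax(logits)[idx, idx].mean()    # drone -> satellite
    loss_s2d = -log_softmax(logits.T)[idx, idx].mean()  # satellite -> drone
    return 0.5 * (loss_d2s + loss_s2d)


# Toy data (hypothetical): 8 locations with 32-dimensional embeddings.
rng = np.random.default_rng(0)
sat = rng.normal(size=(8, 32))
aligned_drone = sat + 0.01 * rng.normal(size=(8, 32))  # nearly matching views
random_drone = rng.normal(size=(8, 32))                # unrelated features

loss_aligned = symmetric_info_nce(aligned_drone, sat)
loss_random = symmetric_info_nce(random_drone, sat)
print(f"aligned pairs: {loss_aligned:.4f}, random pairs: {loss_random:.4f}")
```

The loss rewards each drone embedding for being closer to its own satellite embedding than to any other in the batch. Whatever viewpoint-specific information remains in the features is pulled into the same shared space, which is the assumption the abstract calls into question.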

Taxonomy

Topics: UAV Applications and Optimization · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization