WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization

Jiahao Wen; Hang Yu; Zhedong Zheng

arXiv:2508.09560·cs.CV·December 5, 2025

TL;DR

WeatherPrompt introduces a multi-modality learning approach that enhances drone visual geo-localization across diverse weather conditions by synthesizing weather descriptions and disentangling scene-weather features, leading to improved accuracy.

Contribution

The paper proposes a novel multi-modality framework with a training-free weather reasoning mechanism and dynamic feature fusion, advancing weather-invariant drone geo-localization.

Findings

01

Achieves higher recall rates under diverse weather conditions.

02

Improves Recall@1 by 13.37% under night conditions.

03

Enhances performance under fog and snow by 18.69%.
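
For readers unfamiliar with the metric, Recall@K in retrieval-style geo-localization is the fraction of queries whose top-K ranked gallery items include a correct match. The following is a minimal sketch of that computation, not the paper's evaluation code: it assumes cosine similarity between embeddings, and the function names and toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_k(queries, query_ids, gallery, gallery_ids, k=1):
    """Fraction of queries with a correct location among the top-k gallery items."""
    hits = 0
    for q, qid in zip(queries, query_ids):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(
            range(len(gallery)),
            key=lambda j: cosine(q, gallery[j]),
            reverse=True,
        )
        if any(gallery_ids[j] == qid for j in ranked[:k]):
            hits += 1
    return hits / len(queries)


# Toy data (assumed): three gallery locations and three drone-view queries.
gallery = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
gallery_ids = [0, 1, 2]
queries = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.8]]
query_ids = [0, 1, 2]

r1 = recall_at_k(queries, query_ids, gallery, gallery_ids, k=1)
r3 = recall_at_k(queries, query_ids, gallery, gallery_ids, k=3)
print(f"Recall@1 = {r1:.3f}, Recall@3 = {r3:.3f}")
```

In this toy setup the third query is closest to the wrong gallery item, so it counts as a miss at K=1 and as a hit only once all three gallery items are considered.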

Abstract

Visual geo-localization for drones degrades critically under weather perturbations, e.g., rain and fog, where existing methods struggle with two inherent limitations: 1) heavy reliance on a limited set of weather categories, which constrains generalization, and 2) suboptimal disentanglement of scene-weather features through pseudo weather categories. We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations by fusing the image embedding with the text context. Our framework introduces two key contributions: First, a Training-free Weather Reasoning mechanism that employs off-the-shelf large multi-modality models to synthesize multi-weather textual descriptions through human-like reasoning. This improves scalability to unseen or complex weather and can reflect varying weather intensity. Second, to better disentangle the…
