WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Jiahao Wen, Hang Yu, Zhedong Zheng

TL;DR
WeatherPrompt introduces a multi-modality learning approach that enhances drone visual geo-localization across diverse weather conditions by synthesizing weather descriptions and disentangling scene-weather features, leading to improved accuracy.
Contribution
The paper proposes a novel multi-modality framework with a training-free weather reasoning mechanism and dynamic feature fusion, advancing weather-invariant drone geo-localization.
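As a rough illustration of what fusing an image embedding with a text context can look like, here is a minimal sketch of a text-conditioned gate. It is not the authors' architecture: the function `gated_fusion`, the scalar gate, and the parameters `gate_weights` and `gate_bias` are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(image_emb, text_emb, gate_weights, gate_bias=0.0):
    """Blend an image embedding with a text embedding via a scalar gate.

    Hypothetical sketch: the gate is computed from the text embedding
    alone, so the weather description decides how much text context is
    mixed into the image feature. Assumes both embeddings share one size.
    """
    if not (len(image_emb) == len(text_emb) == len(gate_weights)):
        raise ValueError("embeddings and gate weights must have equal length")
    # Scalar gate in (0, 1) driven by the text embedding.
    gate = sigmoid(sum(w * t for w, t in zip(gate_weights, text_emb)) + gate_bias)
    # Convex combination: gate = 0 keeps the image feature unchanged.
    fused = [(1.0 - gate) * i + gate * t for i, t in zip(image_emb, text_emb)]
    return fused, gate


# Toy usage with made-up 2-dimensional embeddings and an untrained gate.
image_emb = [1.0, 0.0]
text_emb = [0.0, 1.0]
fused, gate = gated_fusion(image_emb, text_emb, gate_weights=[0.0, 0.0])
```

With all-zero gate weights and zero bias the gate sits at 0.5, so the fused vector is the plain average of the two inputs. A trained gate would instead shift the balance per sample.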
Findings
Achieves higher recall rates under diverse weather conditions.
Improves Recall@1 by 13.37% in night conditions.
Enhances performance under fog and snow by 18.69%.
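For readers unfamiliar with the metric, Recall@K in retrieval-style geo-localization is commonly computed as the fraction of queries whose true location appears among the top K retrieved candidates. The sketch below uses made-up location labels and a hypothetical helper `recall_at_k`; it is not the paper's evaluation code.

```python
def recall_at_k(ranked_labels, true_labels, k=1):
    """Fraction of queries whose true label is in the top-k ranked results.

    ranked_labels: one list per query, gallery labels sorted from best
                   to worst match.
    true_labels:   the ground-truth location label for each query.
    """
    if len(ranked_labels) != len(true_labels):
        raise ValueError("one ranking is required per query")
    hits = sum(
        1 for ranked, truth in zip(ranked_labels, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy example: four drone queries ranked against three gallery locations.
ranked = [
    ["A", "B", "C"],  # true location A ranked first
    ["B", "A", "C"],  # true location A ranked second
    ["C", "A", "B"],  # true location C ranked first
    ["B", "C", "A"],  # true location A ranked third
]
truth = ["A", "A", "C", "A"]

r1 = recall_at_k(ranked, truth, k=1)  # 0.5
r2 = recall_at_k(ranked, truth, k=2)  # 0.75
r3 = recall_at_k(ranked, truth, k=3)  # 1.0
```

Recall@1 is the strictest setting, since only the top-ranked match counts.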
Abstract
Visual geo-localization for drones faces critical degradation under weather perturbations, e.g., rain and fog, where existing methods struggle with two inherent limitations: 1) heavy reliance on limited weather categories that constrain generalization, and 2) suboptimal disentanglement of entangled scene-weather features through pseudo weather categories. We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations by fusing the image embedding with the text context. Our framework introduces two key contributions: First, a Training-free Weather Reasoning mechanism that employs off-the-shelf large multi-modality models to synthesize multi-weather textual descriptions through human-like reasoning. It improves scalability to unseen or complex weather and can reflect different weather intensities. Second, to better disentangle the…
