Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery
Chi-Nguyen Tran, Dao Sy Duy Minh, Huynh Trung Kiet, Nguyen Lam Phu Quy, Phu-Hoa Pham, Long Tran-Thanh

TL;DR
This paper introduces SkyPart, a lightweight, prototype-based semantic part discovery method for cross-view geo-localization that improves robustness to weather and altitude variations and sets new state-of-the-art results.
Contribution
SkyPart is a novel patch grouping head for vision transformers that explicitly models semantic parts and incorporates altitude-conditioned modulation and uncertainty-weighted loss.
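To make the idea of an uncertainty-weighted loss concrete, here is a minimal sketch of one common formulation (homoscedastic uncertainty weighting with one learnable log-variance per objective). The summary does not state which form SkyPart uses, so the function name `uncertainty_weighted_loss`, the 0.5 factors, and the log-variance parameterization are assumptions made for illustration only.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine several loss values using one log-variance per objective.

    Assumed form (not confirmed by the summary):
        total = sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i
    where s_i = log(sigma_i^2) would be a learnable scalar in training.
    A large s_i down-weights its loss, while the + 0.5 * s_i term
    penalizes inflating every s_i without bound.
    """
    if len(losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per loss term")
    return sum(
        0.5 * math.exp(-s) * loss + 0.5 * s
        for loss, s in zip(losses, log_vars)
    )


# Hypothetical values: two objectives on different scales.
total = uncertainty_weighted_loss([2.0, 4.0], [0.0, math.log(4.0)])
```

In this form the balance between objectives is learned alongside the model rather than set by hand-tuned scalars, which is the limitation the abstract raises about multi-objective training.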
Findings
SkyPart achieves state-of-the-art accuracy on multiple datasets.
At 26.95M parameters, it is the smallest among the top-performing methods.
SkyPart maintains high performance under weather-related corruptions.
Abstract
Cross-view geo-localization (CVGL), which matches an oblique drone view to a geo-referenced satellite tile, has emerged as a key alternative for autonomous drone navigation when GNSS signals are jammed, spoofed, or unavailable. Despite strong recent progress, three limitations persist: (1) global-descriptor designs compress the patch grid into a single vector without separating layout from texture across the view gap; (2) altitude-related scale variation is retained in the learned embedding rather than marginalized; and (3) multi-objective training relies on hand-tuned scalars over losses on incompatible gradient scales. We propose SkyPart, a lightweight swappable head for patch-based vision transformers (ViTs) that institutes explicit part grouping over the patch grid. SkyPart has four theory-grounded components: (i) learnable prototypes competing for patch tokens via single-pass…
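The abstract is cut off before it names the assignment mechanism, so the following is only a generic sketch of how prototypes can compete for patch tokens. It assumes dot-product similarity and a softmax taken over prototypes for each patch; the names `group_patches` and `temperature` are hypothetical and do not come from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def group_patches(patches, prototypes, temperature=1.0):
    """Softly assign patch tokens to prototypes and pool one vector per part.

    patches:    list of N token vectors (lists of floats, length D)
    prototypes: list of K prototype vectors (length D)
    Returns (assign, parts), where assign[n][k] is the weight of patch n
    on prototype k and parts[k] is the assignment-weighted mean of patches.
    """
    dim = len(prototypes[0])
    assign = []
    for patch in patches:
        logits = [
            sum(a * b for a, b in zip(patch, proto)) / temperature
            for proto in prototypes
        ]
        # Softmax across prototypes: they compete for each patch token.
        assign.append(softmax(logits))
    parts = []
    for k in range(len(prototypes)):
        mass = sum(row[k] for row in assign)  # strictly positive
        parts.append([
            sum(row[k] * patch[d] for row, patch in zip(assign, patches)) / mass
            for d in range(dim)
        ])
    return assign, parts


# Toy example: three 2-D "patch tokens" and two prototypes.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
prototypes = [[5.0, 0.0], [0.0, 5.0]]
assign, parts = group_patches(patches, prototypes)
```

Pooling one vector per prototype keeps several part descriptors instead of compressing the whole patch grid into a single global vector, which is the first limitation the abstract lists.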
