SkyLink: A Large Vision-Language Model Driven Re-ranking Framework for Cross-View UAV Geolocalization
Bowen Liu, Pengyue Jia, Wanyu Wang, Derong Xu, Jiawei Cheng, Jiancheng Dong, Xiao Han, Zimo Zhao, Chao Zhang, Bowen Yu, Fangyu Hong, Xiangyu Zhao

TL;DR
SkyLink is a re-ranking framework that uses a large vision-language model to jointly model the relationships between UAV and satellite views, significantly improving UAV geolocalization accuracy in large-scale image retrieval.
Contribution
The paper presents SkyLink, a plug-and-play ranking framework that models inter-view relationships with a vision-language model and a relational-aware loss, advancing cross-view UAV geolocalization.
Findings
SkyLink significantly improves ranking performance across multiple datasets.
The relational-aware loss enhances training stability and discriminative ability.
Extensive experiments show consistent superiority over existing methods.
Abstract
Cross-view UAV geolocalization is fundamentally a challenging large-scale image retrieval task, aiming to determine the geographic coordinates of Unmanned Aerial Vehicle (UAV) queries by matching them against an extensive geo-tagged satellite image database. Most existing methods learn separate feature representations for each view and determine the final prediction using naive heuristics to assess feature similarity, thereby neglecting to model the crucial cross-view relationships. In this paper, we propose SkyLink, a novel plug-and-play ranking framework that pioneers joint relational modeling of inter-view relationships to enhance cross-view UAV geolocalization. SkyLink leverages a Large Vision-Language Model (LVLM) to model the intricate visual-semantic relationships between UAV and satellite views, facilitating effective cross-view matching. To further refine the learning process,…
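The retrieve-then-re-rank setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings are toy 2-D vectors, and `pair_scorer` is a hypothetical stand-in for a model that scores a UAV query and a satellite candidate jointly (the role the abstract assigns to the LVLM). Stage 1 is the per-view feature-similarity baseline; stage 2 is a plug-in re-ranker over its top-k candidates.

```python
import numpy as np

def cosine_top_k(query_vec, gallery, k):
    """Stage 1: rank satellite embeddings by cosine similarity to the UAV query."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims, kind="stable")[:k]
    return order, sims[order]

def rerank(query, candidate_ids, pair_scorer):
    """Stage 2: re-order candidates with a scorer that sees query and candidate together."""
    scores = np.array([pair_scorer(query, int(c)) for c in candidate_ids])
    order = np.argsort(-scores, kind="stable")
    return [int(candidate_ids[i]) for i in order]

# Toy data (assumed for illustration): four satellite embeddings and one UAV query.
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.2])

# Hypothetical joint scores; a real system would call a learned model here.
toy_scores = {0: 0.9, 1: 0.4, 3: 0.1}
pair_scorer = lambda q, cand_id: toy_scores[cand_id]

candidates, _ = cosine_top_k(query, gallery, k=3)
reranked = rerank(query, candidates, pair_scorer)
print(list(map(int, candidates)), reranked)
```

Because the re-ranker only consumes candidate indices from the first stage, it can sit on top of any retrieval backbone, which is the sense in which such a framework is plug-and-play.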
Taxonomy
Topics: UAV Applications and Optimization · Advanced Image and Video Retrieval Techniques · Remote-Sensing Image Classification
