SMGeo: Cross-View Object Geo-Localization with Grid-Level Mixture-of-Experts
Fan Zhang, Haoyuan Ren, Fei Ma, Qiang Yin, Yongsheng Zhou

TL;DR
SMGeo is a promptable, end-to-end transformer model for cross-view object geo-localization. Given a click on an object in a drone image, it localizes the same object in satellite imagery in real time, using grid-level sparse Mixture-of-Experts, and it reports higher accuracy than prior methods such as DetGeo.
Contribution
The paper introduces SMGeo, a novel end-to-end transformer model with grid-level sparse Mixture-of-Experts for improved cross-view geo-localization accuracy and real-time interactive capabilities.
Findings
Achieves state-of-the-art accuracy on drone-to-satellite localization tasks.
Outperforms existing methods like DetGeo in key metrics.
Demonstrates effectiveness of grid-level MoE and anchor-free detection in this context.
Abstract
Cross-view object geo-localization aims to precisely pinpoint the same object across large-scale satellite imagery based on drone images. Due to significant differences in viewpoint and scale, coupled with complex background interference, traditional multi-stage "retrieval-matching" pipelines are prone to cumulative errors. To address this, we present SMGeo, a promptable end-to-end transformer-based model for object geo-localization. The model supports click prompting and outputs the object's geo-location in real time when prompted, which allows for interactive use. The model employs a fully transformer-based architecture, utilizing a Swin-Transformer for joint feature encoding of both drone and satellite imagery and an anchor-free transformer detection head for coordinate regression. In order to better capture both inter-modal and intra-view dependencies, we introduce a grid-level sparse…
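The abstract is cut off before it describes the grid-level sparse Mixture-of-Experts, so the following is only a generic illustration of sparse top-k expert routing over a grid of feature tokens, not the authors' implementation. Everything here is an assumption made for illustration: the function name sparse_moe, the linear experts, the top-k gating rule, and the sizes (a 3x3 grid, 4 experts, 4-dimensional tokens).

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sparse_moe(tokens, gate_W, experts, k=2):
    """Hypothetical sketch: route each grid token to its top-k experts."""
    outputs, routes = [], []
    for x in tokens:
        logits = matvec(gate_W, x)  # one gate score per expert
        top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]
        weights = softmax([logits[e] for e in top])  # renormalize over chosen experts
        y = [0.0] * len(x)
        for w, e in zip(weights, top):
            out = matvec(experts[e], x)  # each expert is a plain linear map here
            y = [yi + w * oi for yi, oi in zip(y, out)]
        outputs.append(y)
        routes.append(top)
    return outputs, routes

random.seed(0)
dim, n_experts, grid = 4, 4, 3  # assumed sizes, not taken from the paper
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(grid * grid)]
gate_W = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]
experts = [[[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
           for _ in range(n_experts)]

outputs, routes = sparse_moe(tokens, gate_W, experts, k=2)
print(len(outputs), routes[0])
```

Only k of the experts run for each grid cell, which is what makes the mixture sparse: different regions of the image can be handled by different experts while per-token compute stays bounded.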
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Remote-Sensing Image Classification
