TL;DR
This paper introduces a transformer-based feature segmentation and region alignment method called FSRA for UAV-view geo-localization, improving robustness to position shifts and scale changes in cross-view matching tasks.
Contribution
The paper proposes a novel FSRA method that automatically segments and aligns features based on heat distribution without extra supervision, enhancing cross-view geo-localization accuracy.
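As a rough illustration of what segmenting features by heat distribution could look like, the sketch below ranks patch tokens by a heat value and pools them into equal-sized regions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `segment_by_heat`, the definition of heat as the mean of a token's feature values, and the equal-sized split are all hypothetical choices made for this example.

```python
from statistics import fmean


def segment_by_heat(tokens, n_regions):
    """Group patch tokens into n_regions by heat rank and average each group.

    tokens: list of equal-length feature vectors, one per image patch.
    Returns a list of (patch_indices, pooled_feature) pairs, hottest first.
    """
    if not 1 <= n_regions <= len(tokens):
        raise ValueError("n_regions must be between 1 and the number of tokens")

    # Assumption: the heat of a patch is the mean of its feature values.
    heats = [fmean(token) for token in tokens]

    # Rank patch indices from hottest to coldest.
    order = sorted(range(len(tokens)), key=lambda i: heats[i], reverse=True)

    # Split the ranking into n_regions nearly equal chunks.
    base, extra = divmod(len(tokens), n_regions)
    dim = len(tokens[0])
    regions, start = [], 0
    for r in range(n_regions):
        size = base + (1 if r < extra else 0)
        idx = order[start:start + size]
        start += size
        # Average pooling over the patches assigned to this region.
        pooled = [fmean(tokens[i][d] for i in idx) for d in range(dim)]
        regions.append((idx, pooled))
    return regions


# Toy example: four patches with two feature channels each.
patches = [[1.0, 1.0], [5.0, 5.0], [3.0, 3.0], [0.0, 0.0]]
result = segment_by_heat(patches, n_regions=2)
```

Under this reading, regions are defined by heat rank rather than by fixed image coordinates, so a target that moves within the frame would still fall into the same region. That is one plausible way a method of this kind could tolerate position shifts, and no extra labels are needed to form the regions.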
Findings
Achieves state-of-the-art performance in UAV-view geo-localization tasks.
Effectively handles significant shifts and scale changes in images.
Demonstrates superior results in drone target localization and navigation.
Abstract
Cross-view geo-localization is the task of matching images of the same geographic target taken from different views, e.g., unmanned aerial vehicle (UAV) and satellite. The most difficult challenges are position shift and the uncertainty of distance and scale. Existing methods mainly aim to mine more comprehensive fine-grained information. However, they underestimate the importance of extracting robust feature representations and the impact of feature alignment. CNN-based methods have achieved great success in cross-view geo-localization, but they still have limitations: they can only extract part of the information in a local neighborhood, and some scale-reduction operations cause fine-grained information to be lost. In particular, we introduce a simple and efficient transformer-based structure called Feature Segmentation and Region Alignment (FSRA) to enhance the model's…
