VAGeo: View-specific Attention for Cross-View Object Geo-Localization

Zhongyang Li; Xin Yuan; Wei Liu; Xin Xu

arXiv:2501.07194·cs.CV·January 14, 2025

TL;DR

VAGeo introduces view-specific positional encoding and hybrid attention modules to improve cross-view object geo-localization accuracy by addressing viewpoint discrepancies and enhancing feature discrimination.

Contribution

The paper proposes VAGeo, a method that combines a view-specific positional encoding (VSPE) module with a channel-spatial hybrid attention (CSHA) module for more accurate cross-view object geo-localization.

Findings

01 Significant performance improvements on the CVOGL dataset.

02 Increased accuracy at different thresholds for both ground-view and drone-view query images.

03 Effective handling of viewpoint discrepancies in object localization.
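
The second finding refers to accuracy at different thresholds without naming the metric. A common convention in object localization, assumed here rather than taken from this summary, is to count a prediction as correct when the intersection over union (IoU) between the predicted and ground-truth boxes reaches a threshold. The sketch below illustrates that convention only; the function names, the (x1, y1, x2, y2) box format, and the thresholds 0.25 and 0.5 are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def accuracy_at(predictions, ground_truths, threshold):
    """Fraction of predictions whose IoU with the ground truth is >= threshold."""
    hits = sum(iou(p, g) >= threshold for p, g in zip(predictions, ground_truths))
    return hits / len(predictions)


# Toy data (made up for illustration): one exact match, one partial, one miss.
preds = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
truths = [(0, 0, 10, 10), (0, 0, 10, 30), (20, 20, 30, 30)]

for threshold in (0.25, 0.5):  # assumed thresholds
    print(threshold, accuracy_at(preds, truths, threshold))
```

With these toy boxes the three IoU values are 1, 1/3, and 0, so two of the three predictions count as correct at the 0.25 threshold and only one at 0.5. A stricter threshold can only keep or lower the score, which is why results are usually reported at more than one threshold.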

Abstract

Cross-view object geo-localization (CVOGL) aims to locate an object of interest in a captured ground- or drone-view image within the satellite image. However, existing works treat ground-view and drone-view query images equivalently, overlooking their inherent viewpoint discrepancies and the spatial correlation between the query image and the satellite-view reference image. To this end, this paper proposes a novel View-specific Attention Geo-localization method (VAGeo) for accurate CVOGL. Specifically, VAGeo contains two key modules: a view-specific positional encoding (VSPE) module and a channel-spatial hybrid attention (CSHA) module. At the object level, in the VSPE module, viewpoint-specific positional encodings are designed according to the characteristics of ground-view and drone-view query images, so that the click-point object in the query image is identified more accurately. In…
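
The abstract says the VSPE module uses a different positional encoding for each query viewpoint to mark the click-point object, but the text available here does not give the formulation. As a rough illustration of the general idea only, the sketch below builds a map that peaks at the click point and decays with distance, with a spread chosen per view. Everything in it is a hypothetical stand-in: the Gaussian form, the `VIEW_SIGMA` values, and the function name `click_point_encoding` are assumptions and not the paper's design.

```python
import math

# Hypothetical per-view spreads in normalized image units; not values from the paper.
VIEW_SIGMA = {"ground": 0.10, "drone": 0.25}


def click_point_encoding(height, width, click_xy, view):
    """Return a height x width map that peaks at the click point.

    Each cell holds exp(-d^2 / (2 * sigma^2)), where d is the distance from
    the cell center to the click point in normalized [0, 1] coordinates and
    sigma depends on the query view (an assumed design, for illustration).
    """
    sigma = VIEW_SIGMA[view]
    cx, cy = click_xy
    encoding = []
    for row in range(height):
        y = (row + 0.5) / height  # normalized center of this row
        encoding.append([
            math.exp(-(((col + 0.5) / width - cx) ** 2 + (y - cy) ** 2)
                     / (2 * sigma ** 2))
            for col in range(width)
        ])
    return encoding


# Same click point, two views: the drone map decays more slowly than the ground map.
ground_map = click_point_encoding(4, 4, (0.125, 0.125), "ground")
drone_map = click_point_encoding(4, 4, (0.125, 0.125), "drone")
```

In a setup like this, the map would typically be attached to the query image or its features as an extra channel so that later layers know which object was clicked. How VAGeo actually injects the encoding is not stated in the truncated abstract.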

Taxonomy

Methods: Softmax · Attention Is All You Need