Object-level Cross-view Geo-localization with Location Enhancement and Multi-Head Cross Attention

Zheyang Huang; Jagannath Aryal; Saeid Nahavandi; Xuequan Lu; Chee Peng Lim; Lei Wei; Hailing Zhou

arXiv:2505.17911 · cs.CV · May 26, 2025

TL;DR

This paper introduces OCGNet, an object-level cross-view geo-localization network that combines user-specified click locations, location enhancement, and multi-head cross attention to improve accuracy and generalization in challenging scenarios.

Contribution

It proposes a new network architecture that integrates location information and attention mechanisms for precise object-level geo-localization across views.
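To make the attention component concrete, here is a minimal NumPy sketch of generic multi-head cross attention, in which tokens from one view attend to tokens from another. It is not taken from the OCGNet code: the function name, the token shapes, the random weights, and the choice of query-view tokens attending to satellite tokens are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(query_tokens, context_tokens,
                               w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head cross attention (hypothetical helper).

    query_tokens:   (n_q, d) tokens that ask the question (e.g. query view).
    context_tokens: (n_c, d) tokens that are attended to (e.g. satellite view).
    """
    n_q, d = query_tokens.shape
    n_c = context_tokens.shape[0]
    d_head = d // num_heads

    # Project, then split the feature dimension into heads: (H, n, d_head).
    q = (query_tokens @ w_q).reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = (context_tokens @ w_k).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)
    v = (context_tokens @ w_v).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product attention per head: (H, n_q, n_c).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)

    # Weighted sum of values, then merge heads back to (n_q, d).
    heads = weights @ v
    merged = heads.transpose(1, 0, 2).reshape(n_q, d)
    return merged @ w_o, weights

# Illustrative shapes only: 5 query tokens, 7 context tokens, width 16, 4 heads.
rng = np.random.default_rng(0)
d_model, num_heads = 16, 4
query = rng.normal(size=(5, d_model))
context = rng.normal(size=(7, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, attn = multi_head_cross_attention(query, context, w_q, w_k, w_v, w_o, num_heads)
```

Each query token ends up with one attention distribution per head over the context tokens, so different heads can focus on different regions of the other view.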

Findings

01  Achieves state-of-the-art results on the CVOGL dataset.

02  Demonstrates effective few-shot learning capabilities.

03  Robustly localizes objects despite variations in viewpoint and imaging conditions.

Abstract

Cross-view geo-localization determines the location of a query image, captured by a drone or ground-based camera, by matching it to a geo-referenced satellite image. While traditional approaches focus on image-level localization, many applications, such as search-and-rescue, infrastructure inspection, and precision delivery, demand object-level accuracy. This enables users to prompt a specific object with a single click on a drone image to retrieve precise geo-tagged information of the object. However, variations in viewpoints, timing, and imaging conditions pose significant challenges, especially when identifying visually similar objects in extensive satellite imagery. To address these challenges, we propose an Object-level Cross-view Geo-localization Network (OCGNet). It integrates user-specified click locations using Gaussian Kernel Transfer (GKT) to preserve location information…
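One common way to feed a click location to a network is to encode it as a Gaussian heatmap centered on the clicked pixel. The sketch below illustrates only that general idea; it is not the paper's GKT module, whose details are cut off in the abstract above. The function name, image size, click position, and `sigma` value are hypothetical.

```python
import numpy as np

def click_to_gaussian_heatmap(height, width, click_xy, sigma=8.0):
    """Encode a click at pixel (x, y) as a 2D Gaussian heatmap (hypothetical helper).

    The map equals 1.0 at the clicked pixel and decays smoothly with distance,
    so nearby pixels still carry a soft indication of where the user clicked.
    """
    cx, cy = click_xy
    # Pixel coordinate grids: ys varies along rows, xs along columns.
    ys, xs = np.mgrid[0:height, 0:width]
    sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Illustrative values: a 64x96 image with a click at x=40, y=20.
heatmap = click_to_gaussian_heatmap(64, 96, click_xy=(40, 20))
```

A map like this can be concatenated with the image as an extra channel, or used to weight feature maps; which of these OCGNet does, if either, is not stated in the text shown here.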

Code & Models

Repositories

zheyangh/ocgnet (PyTorch, official)

Taxonomy

Methods: Softmax · Attention Is All You Need · Focus