Multi-Level Embedding and Alignment Network with Consistency and Invariance Learning for Cross-View Geo-Localization

Zhongwei Chen; Zhao-Xu Yang; Hai-Jun Rong

arXiv:2412.14819 · cs.CV · May 27, 2025

TL;DR

This paper introduces MEAN, a lightweight multi-level embedding and alignment network for cross-view geo-localization. It learns robust, invariant features at reduced computational cost while matching or exceeding the accuracy of existing models.

Contribution

The paper presents a novel lightweight network with multi-level enhancement and cross-domain alignment for effective cross-view feature matching in geo-localization.

Findings

01. Reduces model parameters by 62.17%.
02. Decreases computational complexity by 70.99%.
03. Achieves competitive or superior localization accuracy.

Abstract

Cross-View Geo-Localization (CVGL) involves determining the localization of drone images by retrieving the most similar GPS-tagged satellite images. However, the imaging gaps between platforms are often significant and the variations in viewpoints are substantial, which limits the ability of existing methods to effectively associate cross-view features and extract consistent and invariant characteristics. Moreover, existing methods often overlook the problem of increased computational and storage requirements when improving model performance. To handle these limitations, we propose a lightweight enhanced alignment network, called the Multi-Level Embedding and Alignment Network (MEAN). The MEAN network uses a progressive multi-level enhancement strategy, global-to-local associations, and cross-domain alignment, enabling feature communication across levels. This allows MEAN to effectively…
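
The retrieval formulation in the first sentence of the abstract can be illustrated with a minimal sketch. This is not the MEAN network: it assumes embeddings have already been produced by some model, and it only shows the generic step of ranking GPS-tagged satellite tiles by cosine similarity to a drone-image embedding. The function names, tile keys, and toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_gallery(query, gallery):
    """Return gallery keys sorted by descending cosine similarity to the query."""
    q = l2_normalize(query)
    scores = {}
    for key, embedding in gallery.items():
        g = l2_normalize(embedding)
        scores[key] = sum(a * b for a, b in zip(q, g))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy embeddings; a real system would get these from a trained model.
drone_query = [0.9, 0.1, 0.0]
satellite_gallery = {
    "tile_a": [0.0, 1.0, 0.0],
    "tile_b": [1.0, 0.2, 0.0],
    "tile_c": [0.0, 0.0, 1.0],
}

ranking = rank_gallery(drone_query, satellite_gallery)
print(ranking)  # best match first
```

The top-ranked tile's GPS tag would then serve as the location estimate for the drone image. How well this works depends entirely on how consistent the embeddings are across views, which is the problem the paper targets.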

Code & Models

Repositories

ischenawei/mean (PyTorch, official)

Taxonomy

Topics: Face and Expression Recognition · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques