Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information

Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, and Xian Sun

arXiv:2204.09860 · cs.CV · April 22, 2022

TL;DR

This paper introduces a cross-modal remote sensing text-image retrieval framework that combines global and local image features, improving retrieval accuracy through dynamic feature fusion and enhanced local representations.

Contribution

The paper proposes GaLR, an RSCTIR framework with a multi-level information dynamic fusion (MIDF) module and DREA-enhanced local feature extraction, and reports state-of-the-art results on public datasets.

Findings

01. GaLR outperforms existing methods on public datasets.

02. The DREA module improves local feature quality.

03. The multivariate rerank algorithm enhances retrieval precision.

Abstract

Cross-modal remote sensing text-image retrieval (RSCTIR) has recently become an urgent research hotspot because it enables fast and flexible information extraction from remote sensing (RS) images. However, current RSCTIR methods mainly focus on global features of RS images, neglecting the local features that reflect target relationships and saliency. In this article, we first propose a novel RSCTIR framework based on global and local information (GaLR), and design a multi-level information dynamic fusion (MIDF) module to effectively integrate features of different levels. MIDF leverages local information to correct global information, utilizes global information to supplement local information, and uses the dynamic addition of the two to generate a prominent visual representation. To alleviate the pressure of the redundant targets on the graph convolution…
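The "dynamic addition" of global and local features described in the abstract can be illustrated with a generic gated sum. The sketch below is not the MIDF module from the paper or its repository: the function name `dynamic_fuse`, the gate weights `w_global` and `w_local`, and the single scalar gate are all assumptions made for illustration, and the correction and supplement steps that MIDF performs before the addition are omitted.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale v to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def dynamic_fuse(global_feat, local_feat, w_global, w_local, bias=0.0):
    """Hypothetical gated fusion of a global and a local feature vector.

    A scalar gate in (0, 1) is computed from both inputs, so the mix
    changes per image instead of being a fixed constant. The weights
    would be learned in a real model; here they are plain arguments.
    """
    gate = sigmoid(dot(w_global, global_feat) + dot(w_local, local_feat) + bias)
    fused = [gate * g + (1.0 - gate) * l for g, l in zip(global_feat, local_feat)]
    return l2_normalize(fused), gate


# With all-zero gate weights the gate is exactly 0.5 (an even mix).
fused, gate = dynamic_fuse([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0])
print(gate, fused)
```

Because the gate depends on the inputs, an image whose local (object-level) evidence is strong can lean on the local vector, while a cluttered scene can fall back on the global one. This is one plausible reading of "dynamic" and should be checked against the official repository.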

Code & Models

Repositories

xiaoyuan1996/galr (PyTorch, official)

Taxonomy

Methods: Graph Convolutional Network · Convolution