Robust Remote Sensing Image-Text Retrieval with Noisy Correspondence

Qiya Song; Yiqiang Xie; Yuan Sun; Renwei Dian; Xudong Kang

arXiv:2603.28134·cs.CV·March 31, 2026

TL;DR

This paper introduces a robust method for remote sensing image-text retrieval that handles noisy correspondence (mismatched image-text pairs) by combining a self-paced learning strategy with a robust triplet loss, improving performance on benchmark datasets.

Contribution

The paper proposes a novel Robust Remote Sensing Image-Text Retrieval (RRSITR) framework that uses a self-paced learning approach and a robust triplet loss to address noisy correspondence in remote sensing image-text retrieval.
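To make the two ingredients concrete, here is a minimal sketch of how self-paced sample selection can sit on top of a triplet loss. It is not the RRSITR implementation: the function names, the hard 0/1 weighting rule, the `threshold` parameter, and the plain hinge loss with hardest in-batch negatives are all assumptions chosen for illustration, and the paper's robust triplet loss may differ.

```python
from typing import List


def triplet_losses(sim: List[List[float]], margin: float = 0.2) -> List[float]:
    """Per-pair bidirectional hinge triplet loss with the hardest in-batch negative.

    sim[i][j] is the similarity between image i and text j. Diagonal entries
    are the annotated (possibly noisy) pairs. This is a standard hinge loss,
    used here as a stand-in for the paper's robust variant.
    """
    n = len(sim)
    losses = []
    for i in range(n):
        pos = sim[i][i]
        hardest_text = max(sim[i][j] for j in range(n) if j != i)
        hardest_image = max(sim[j][i] for j in range(n) if j != i)
        loss = max(0.0, margin - pos + hardest_text)
        loss += max(0.0, margin - pos + hardest_image)
        losses.append(loss)
    return losses


def self_paced_weights(losses: List[float], threshold: float) -> List[float]:
    """Hard self-paced weights (assumed rule): keep pairs whose loss is below threshold."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


def weighted_loss(losses: List[float], weights: List[float]) -> float:
    """Average the losses of the currently selected pairs."""
    kept = sum(weights)
    if kept == 0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / kept


# Toy batch: pairs 0 and 1 are well aligned, pair 2 is mismatched.
sim = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.5, 0.4, 0.1],
]
losses = triplet_losses(sim)
early = self_paced_weights(losses, threshold=0.5)  # early training: easy pairs only
late = self_paced_weights(losses, threshold=1.0)   # later: harder pairs admitted
```

Raising `threshold` over the course of training is what produces the easy-to-hard curriculum: with the small threshold the mismatched pair is excluded from the update, and with the larger one it is admitted.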

Findings

01. Outperforms state-of-the-art methods on benchmark datasets.

02. Effectively handles high noise rates in data.

03. Improves retrieval accuracy with noisy and mismatched data.

Abstract

As a pivotal task that bridges remote visual and linguistic understanding, Remote Sensing Image-Text Retrieval (RSITR) has attracted considerable research interest in recent years. However, almost all RSITR methods implicitly assume that image-text pairs are perfectly matched. In practice, acquiring a large set of well-aligned data pairs is often prohibitively expensive or even infeasible. We also observe that remote sensing datasets (e.g., RSITMD) do contain some inaccurate or mismatched image-text descriptions. Based on these observations, we identify an important but unexplored problem in RSITR, i.e., Noisy Correspondence (NC). To address this challenge, we propose a novel Robust Remote Sensing Image-Text Retrieval (RRSITR) paradigm that designs a self-paced learning strategy to mimic human cognitive learning patterns, thereby learning from easy to hard from…

Code & Models

Repositories

MSFLabX/RRSITR (GitHub)
