Camouflage-aware Image-Text Retrieval via Expert Collaboration

Yao Jiang; Zhongkuan Mao; Xuan Wu; Keren Fu; Qijun Zhao

arXiv:2604.01251·cs.CV·April 3, 2026

TL;DR

This paper introduces a new task, camouflage-aware image-text retrieval (CA-ITR), builds a dedicated dataset for it (CamoIT, about 10.5K samples), and proposes a camouflage-expert collaborative network with a confidence-conditioned graph attention mechanism to improve retrieval accuracy in camouflaged scenes.

Contribution

The paper formulates camouflage-aware image-text retrieval (CA-ITR) as a new task, constructs the CamoIT dataset with multi-granularity textual annotations, and develops a camouflage-expert collaborative network with confidence-conditioned graph attention.

Findings

01 CECNet achieves approximately 29% improvement in overall CA-ITR accuracy.

02 Benchmark results highlight the challenges posed by camouflage properties.

03 The proposed method surpasses seven existing retrieval models.

Abstract

Camouflaged scene understanding (CSU) has attracted significant attention due to its broad practical implications. However, robust image-text cross-modal alignment remains under-explored in this field, hindering deeper understanding of camouflaged scenarios and their related applications. To this end, we focus on the typical image-text retrieval task and formulate a new task dubbed "camouflage-aware image-text retrieval" (CA-ITR). We first construct a dedicated camouflage image-text retrieval dataset (CamoIT), comprising ~10.5K samples with multi-granularity textual annotations. Benchmark results on CamoIT reveal the underlying challenges that CA-ITR poses for existing cutting-edge retrieval techniques, which stem mainly from objects' camouflage properties and from complex image content. As a solution, we propose a camouflage-expert collaborative network…
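
For readers unfamiliar with image-text retrieval, the following is a minimal sketch of how text-to-image retrieval is commonly scored with Recall@K over embedding similarities. It is an illustration only, not the paper's evaluation code: the abstract does not state which metric CA-ITR uses, and the function names (`cosine`, `recall_at_k`), the toy embeddings, and the ground-truth indices are all assumptions introduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def recall_at_k(query_embs, gallery_embs, gt_index, k):
    """Fraction of queries whose ground-truth gallery item ranks in the top k."""
    hits = 0
    for query, gt in zip(query_embs, gt_index):
        scores = [cosine(query, item) for item in gallery_embs]
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(range(len(gallery_embs)),
                        key=lambda i: scores[i], reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return hits / len(query_embs)

# Hypothetical 2-D embeddings: three text queries and three gallery images.
text_embs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
image_embs = [[0.9, 0.1], [0.7, 0.7], [0.1, 0.9]]
ground_truth = [0, 2, 2]  # assumed index of the matching image per query

r1 = recall_at_k(text_embs, image_embs, ground_truth, k=1)
r2 = recall_at_k(text_embs, image_embs, ground_truth, k=2)
print(f"R@1 = {r1:.3f}, R@2 = {r2:.3f}")
```

In this toy setup the third query ranks its matching image second, so it counts as a miss at K=1 and a hit at K=2. Image-to-text retrieval can be scored the same way by swapping the roles of the query and gallery sets.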

Code & Models

Repositories

jiangyao-scu/CA-ITR (GitHub)