Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection

Ji Du; Xin Wang; Fangwei Hao; Mingyang Yu; Chunyuan Chen; Jiesheng Wu; Bin Wang; Jing Xu; Ping Li

arXiv:2510.18437·cs.CV·October 22, 2025

TL;DR

This paper introduces RISE, an unsupervised paradigm for camouflaged object detection that uses dataset-level context and prototype libraries to generate pseudo-labels without annotations, improving detection performance over prior unsupervised methods.

Contribution

The paper proposes RISE, a self-augmented unsupervised framework that constructs prototype libraries and uses multi-view KNN retrieval to enhance camouflaged object detection without requiring annotations.

Findings

1. RISE outperforms state-of-the-art unsupervised methods.
2. The clustering-then-retrieval strategy improves prototype quality.
3. Multi-view KNN retrieval enhances pseudo-mask robustness.

Abstract

At the core of Camouflaged Object Detection (COD) lies segmenting objects from their highly similar surroundings. Previous efforts navigate this challenge primarily through image-level modeling or annotation-based optimization. Despite advancing considerably, this commonplace practice hardly taps valuable dataset-level contextual information or relies on laborious annotations. In this paper, we propose RISE, a RetrIeval SElf-augmented paradigm that exploits the entire training dataset to generate pseudo-labels for single images, which could be used to train COD models. RISE begins by constructing prototype libraries for environments and camouflaged objects using training images (without ground truth), followed by K-Nearest Neighbor (KNN) retrieval to generate pseudo-masks for each image based on these libraries. It is important to recognize that using only training images without…
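The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes per-location feature vectors are already available, that the prototype libraries are plain arrays of the same dimensionality, and that each location is labeled by a majority vote over its k nearest prototypes under cosine similarity. The function name `knn_pseudo_mask`, the vote rule, and the similarity measure are all illustrative assumptions, and the multi-view aspect of RISE is not modeled here.

```python
import numpy as np


def knn_pseudo_mask(features, fg_protos, bg_protos, k=3):
    """Label each location as object or environment by KNN over prototypes.

    Hypothetical helper, not from the paper.
    features:  (H, W, D) array of per-location feature vectors.
    fg_protos: (Nf, D) array standing in for the camouflaged-object library.
    bg_protos: (Nb, D) array standing in for the environment library.
    Returns an (H, W) boolean pseudo-mask (True = object).
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d)

    # Merge both libraries and remember which rows are object prototypes.
    library = np.concatenate([fg_protos, bg_protos], axis=0)
    is_fg = np.concatenate([
        np.ones(len(fg_protos), dtype=bool),
        np.zeros(len(bg_protos), dtype=bool),
    ])

    def l2_normalize(x):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

    # Cosine similarity between every location and every prototype.
    sim = l2_normalize(flat) @ l2_normalize(library).T

    # Indices of the k most similar prototypes for each location.
    nearest = np.argsort(-sim, axis=1)[:, :k]

    # Assumed decision rule: strict majority of the k neighbors are object prototypes.
    fg_votes = is_fg[nearest].sum(axis=1)
    return (fg_votes * 2 > k).reshape(h, w)
```

With a dense feature map from any frozen backbone, calling `knn_pseudo_mask(features, fg_protos, bg_protos, k=3)` yields a binary mask at feature resolution; how the libraries themselves are built (the clustering step) is outside this sketch.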

Taxonomy

Topics: Visual Attention and Saliency Detection · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques