First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection

Wutao Liu, YiDan Wang, and Pan Gao

arXiv:2508.15313 · cs.CV · September 16, 2025

TL;DR

This paper introduces RAG-SEG, a training-free, two-stage paradigm for camouflaged object detection (COD): retrieval-augmented generation produces coarse masks that serve as prompts, and a SAM-based segmentation stage refines them. The authors report results competitive with trained methods.

Contribution

The paper decouples COD into a retrieval stage and a segmentation stage, which removes the training step entirely and reduces computational cost.

Findings

01. Performs on par with or better than state-of-the-art methods on benchmark datasets.

02. Runs efficiently on a personal laptop, which makes it practical to deploy.

03. Generates accurate masks without any training process.

Abstract

Camouflaged object detection (COD) poses a significant challenge in computer vision due to the high similarity between objects and their backgrounds. Existing approaches often rely on heavy training and large computational resources. While foundation models such as the Segment Anything Model (SAM) offer strong generalization, they still struggle to handle COD tasks without fine-tuning and require high-quality prompts to yield good performance. However, generating such prompts manually is costly and inefficient. To address these challenges, we propose First RAG, Second SEG (RAG-SEG), a training-free paradigm that decouples COD into two stages: Retrieval-Augmented Generation (RAG) for generating coarse masks as prompts, followed by SAM-based segmentation (SEG) for refinement. RAG-SEG constructs a compact retrieval database via unsupervised clustering, enabling fast and effective…
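The two-stage idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the function names (`build_database`, `retrieve_coarse_mask`, `mask_to_box`), the use of plain k-means, the Euclidean nearest-centroid lookup, the 0.5 threshold, and the synthetic 4-D "features" are all assumptions made for illustration. In a real pipeline the features would presumably come from a frozen image encoder, and the resulting box would be passed as a prompt to a promptable segmenter such as SAM, which is deliberately not called here.

```python
import numpy as np


def build_database(features, labels, k=8, iters=20, seed=0):
    """Hypothetical helper: compress labelled patch features into k centroids.

    Each centroid stores the mean foreground label of its members, so the
    database holds k entries instead of one entry per patch.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every patch to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = features[assign == c]
            if len(members):  # leave empty clusters where they are
                centroids[c] = members.mean(axis=0)
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    scores = np.array(
        [labels[assign == c].mean() if (assign == c).any() else 0.0 for c in range(k)]
    )
    return centroids, scores


def retrieve_coarse_mask(query_features, centroids, scores, shape, threshold=0.5):
    """Hypothetical helper: look up the nearest centroid per query patch."""
    query_features = np.asarray(query_features, dtype=float)
    dists = ((query_features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    nearest = dists.argmin(axis=1)
    return scores[nearest].reshape(shape) >= threshold


def mask_to_box(mask):
    """Turn a coarse mask into an (x_min, y_min, x_max, y_max) box prompt."""
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Synthetic stand-in data (assumption): background features near 0,
# foreground features near 5, so the two groups are easy to separate.
rng = np.random.default_rng(0)
db_feats = np.vstack([rng.normal(0.0, 0.1, (200, 4)), rng.normal(5.0, 0.1, (50, 4))])
db_labels = np.concatenate([np.zeros(200), np.ones(50)])
centroids, scores = build_database(db_feats, db_labels, k=4)

# A 6x6 grid of query patches with a foreground rectangle.
truth = np.zeros((6, 6), dtype=bool)
truth[2:4, 1:5] = True
query = rng.normal(0.0, 0.1, (36, 4)) + 5.0 * truth.reshape(-1, 1)

mask = retrieve_coarse_mask(query, centroids, scores, truth.shape)
box = mask_to_box(mask)  # would serve as the prompt for the segmentation stage
```

The point of the sketch is the division of labour: the retrieval stage needs only a nearest-neighbour lookup against a small clustered database, and everything it produces (a coarse mask, or a box derived from it) is a prompt rather than a final prediction.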

Taxonomy

Topics: Visual Attention and Saliency Detection · Image Enhancement Techniques · Advanced Image and Video Retrieval Techniques