ST-SAM: SAM-Driven Self-Training Framework for Semi-Supervised Camouflaged Object Detection

Xihang Hu; Fuming Sun; Jiazhe Liu; Feilong Xu; Xiaoli Zhang

arXiv:2507.23307 · cs.CV · August 1, 2025

TL;DR

ST-SAM introduces a self-training framework that leverages the Segment Anything Model (SAM) with hybrid prompts to achieve state-of-the-art performance in semi-supervised camouflaged object detection using only 1% labeled data.

Contribution

It proposes a novel, annotation-efficient self-training method that transforms pseudo-labels into hybrid prompts, enabling a single-model architecture to outperform existing methods.

Findings

01  Achieves state-of-the-art results with only 1% labeled data.

02  Outperforms existing SSCOD methods and matches fully supervised performance.

03  Requires training only a single network, with no specially designed models or loss functions.

Abstract

Semi-supervised Camouflaged Object Detection (SSCOD) aims to reduce reliance on costly pixel-level annotations by leveraging limited annotated data and abundant unlabeled data. However, existing SSCOD methods based on Teacher-Student frameworks suffer from severe prediction bias and error propagation under scarce supervision, while their multi-network architectures incur high computational overhead and limited scalability. To overcome these limitations, we propose ST-SAM, a highly annotation-efficient yet concise framework that breaks away from conventional SSCOD constraints. Specifically, ST-SAM employs a Self-Training strategy that dynamically filters and expands high-confidence pseudo-labels to enhance a single-model architecture, thereby fundamentally circumventing inter-model prediction bias. Furthermore, by transforming pseudo-labels into hybrid prompts containing domain-specific…
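
The abstract is cut off before it says how pseudo-labels are scored or what the hybrid prompts contain, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a pseudo-label is a per-pixel foreground probability map, that confidence is the mean distance of each pixel from 0.5, and that a "hybrid prompt" is a bounding box plus one point. All function names, thresholds, and the prompt layout are hypothetical, and the "expand" half of the strategy is not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a pseudo-label is a 2D grid of foreground probabilities in [0, 1].
Mask = List[List[float]]
Prompts = Dict[str, Tuple[int, ...]]


def mask_confidence(prob: Mask) -> float:
    """Hypothetical score: mean distance of each pixel from 0.5, scaled to [0, 1]."""
    values = [p for row in prob for p in row]
    return sum(abs(p - 0.5) * 2 for p in values) / len(values)


def mask_to_prompts(prob: Mask, fg_thresh: float = 0.5) -> Optional[Prompts]:
    """Derive a box prompt and a point prompt from one probability map."""
    fg = [
        (x, y)
        for y, row in enumerate(prob)
        for x, p in enumerate(row)
        if p >= fg_thresh
    ]
    if not fg:
        return None  # no foreground pixels, so nothing to prompt with
    xs = [x for x, _ in fg]
    ys = [y for _, y in fg]
    box = (min(xs), min(ys), max(xs), max(ys))  # (x_min, y_min, x_max, y_max)
    # Use the most confident foreground pixel as the point prompt.
    point = max(fg, key=lambda xy: prob[xy[1]][xy[0]])
    return {"box": box, "point": point}


def select_pseudo_labels(
    preds: Dict[str, Mask], conf_thresh: float = 0.8
) -> Dict[str, Prompts]:
    """Keep only high-confidence predictions and convert them to prompts."""
    selected: Dict[str, Prompts] = {}
    for name, prob in preds.items():
        if mask_confidence(prob) < conf_thresh:
            continue  # low-confidence prediction: leave the image unlabeled
        prompts = mask_to_prompts(prob)
        if prompts is not None:
            selected[name] = prompts
    return selected
```

In a self-training loop of this kind, the selected prompts would be passed to the segmentation model in the next round, and images that fail the threshold would stay in the unlabeled pool until a later round.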
