ZS-VCOS: Zero-Shot Video Camouflaged Object Segmentation By Optical Flow and Open Vocabulary Object Detection
Wenqi Guo, Mohamed Shehata, Shan Du

TL;DR
This paper introduces a zero-shot video camouflaged object segmentation method that leverages large pre-trained models and optical flow, significantly outperforming existing methods without training on task-specific data.
Contribution
It presents a novel zero-shot segmentation pipeline integrating SAM-2, Owl-v2, and temporal cues, achieving state-of-the-art results on multiple datasets.
Findings
Outperforms existing zero-shot methods with F-measure of 0.628
Surpasses supervised methods with F-measure of 0.628
Increases success rate on MoCA-Filter dataset from 0.628 to 0.697
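For readers unfamiliar with the metric, the sketch below shows how a plain F-measure can be computed from a predicted binary mask and a ground-truth mask. It is only an illustration: the function name `f_measure`, the `beta_sq` parameter and its default of 1.0, and the `eps` smoothing term are assumptions made here. The paper's reported scores may use a different (for example, weighted) variant of the F-measure.

```python
import numpy as np


def f_measure(pred, gt, beta_sq=1.0, eps=1e-8):
    """Plain F-measure between two binary masks (illustrative, not the paper's code).

    beta_sq is beta squared; 1.0 gives the usual F1 score (an assumed default).
    eps avoids division by zero when a mask is empty.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    # True positives: pixels marked as object in both masks.
    tp = np.logical_and(pred, gt).sum()

    precision = tp / (pred.sum() + eps)  # fraction of predicted pixels that are correct
    recall = tp / (gt.sum() + eps)       # fraction of object pixels that were found

    return float((1 + beta_sq) * precision * recall
                 / (beta_sq * precision + recall + eps))


# Toy 2x2 example: two pixels predicted, one of them correct, one object pixel in total.
pred_mask = [[1, 1], [0, 0]]
gt_mask = [[1, 0], [0, 0]]
score = f_measure(pred_mask, gt_mask)  # precision 0.5, recall 1.0
```

In the toy example precision is 0.5 and recall is 1.0, so the score is close to 2/3. A dataset-level figure would typically average such per-frame scores, although the exact protocol is not stated in this summary.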
Abstract
Camouflaged object segmentation presents unique challenges compared to traditional segmentation tasks, primarily due to the high similarity in patterns and colors between camouflaged objects and their backgrounds. Effective solutions to this problem have significant implications in critical areas such as pest control, defect detection, and lesion segmentation in medical imaging. Prior research has predominantly emphasized supervised or unsupervised pre-training methods, leaving zero-shot approaches significantly underdeveloped. Existing zero-shot techniques commonly use the Segment Anything Model (SAM) in automatic mode or rely on vision-language models to generate cues for segmentation; however, their performance remains unsatisfactory because the camouflaged object closely resembles its background. This work studies how to avoid training by integrating large pre-trained models…
Taxonomy
Topics: Visual Attention and Saliency Detection · Multimodal Machine Learning Applications · Social Robot Interaction and HRI
Methods: Segment Anything Model
