SSFam: Scribble Supervised Salient Object Detection Family
Zhengyi Liu, Sheng Deng, Xinrui Wang, Linbo Wang, Xianyong Fang, Bin Tang

TL;DR
This paper introduces SSFam, a novel family of scribble supervised salient object detection models leveraging the Segment Anything Model (SAM) and multi-modal inputs to improve segmentation accuracy in complex scenes.
Contribution
The paper proposes a new SSFam framework that integrates modal-aware modulators and a siamese decoder with SAM for enhanced scribble supervised salient object detection across multiple modalities.
Findings
Achieves state-of-the-art performance among scribble supervised methods.
Effectively combines RGB, depth, and thermal modalities for improved segmentation.
Approaches the performance of fully supervised methods in complex scene segmentation.
Abstract
Scribble supervised salient object detection (SSSOD) learns to segment attractive objects from their surroundings under the supervision of sparse scribble labels. For better segmentation in complex scenes, depth and thermal infrared modalities serve as supplements to RGB images. Existing methods design separate feature extraction and multi-modal fusion strategies for RGB, RGB-Depth, RGB-Thermal, and Visual-Depth-Thermal image inputs, leading to a flood of similar models. Because the recently proposed Segment Anything Model (SAM) possesses extraordinary segmentation and interactive prompting capability, we propose an SSSOD family based on SAM, named SSFam, for inputs that combine different modalities. First, different modal-aware modulators are designed to attain modal-specific knowledge, which cooperates with modal-agnostic information extracted…
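To make "supervision of sparse scribble labels" concrete, here is a minimal sketch of a partial binary cross-entropy, a loss commonly used in scribble supervised segmentation in which only the annotated pixels contribute. It is an illustration only: the abstract shown here does not state which loss SSFam uses, and the function name `partial_bce` and the label convention (1 = foreground scribble, 0 = background scribble, 255 = unlabeled) are assumptions made for this example.

```python
import numpy as np

# Hypothetical label convention (an assumption, not taken from the paper):
# 1 = foreground scribble, 0 = background scribble, 255 = unlabeled pixel.
UNLABELED = 255


def partial_bce(pred, scribble, eps=1e-7):
    """Mean binary cross-entropy computed over scribble-labeled pixels only."""
    # Clip probabilities so that log() never receives exactly 0 or 1.
    pred = np.clip(pred, eps, 1.0 - eps)
    labeled = scribble != UNLABELED
    if not labeled.any():
        # No annotated pixels: nothing to supervise.
        return 0.0
    target = scribble[labeled].astype(np.float64)
    p = pred[labeled]
    loss = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    return float(loss.mean())


# Toy 2x2 saliency map: only the top-left and bottom-right pixels are annotated.
pred = np.array([[0.9, 0.2], [0.6, 0.1]])
scribble = np.array([[1, 255], [255, 0]], dtype=np.uint8)
loss = partial_bce(pred, scribble)
```

In this toy case the two unlabeled pixels are ignored entirely, so changing their predicted values leaves the loss unchanged. That is the sense in which scribble supervision is sparse.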
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Image Fusion Techniques
Methods: Segment Anything Model
