A Visual Representation-guided Framework with Global Affinity for Weakly Supervised Salient Object Detection
Binwei Xu, Haoran Liang, Weihua Gong, Ronghua Liang, Peng Chen

TL;DR
This paper introduces a visual representation-guided framework with global affinity for weakly supervised salient object detection, leveraging self-supervised learning and semantic affinity loss to improve accuracy without extra labels.
Contribution
It proposes a novel framework that combines general visual representations with task-specific features using a global semantic affinity loss, enhancing weakly supervised SOD performance.
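The summary does not give the form of the global semantic affinity loss, so the following is only a minimal illustrative sketch of the general idea, not the authors' formulation. It assumes the loss compares pairwise cosine affinities between task-specific features and general visual features at the same N locations. The function names `cosine_affinity` and `global_affinity_loss`, and the choice of a mean absolute difference, are hypothetical.

```python
import math


def cosine_affinity(features):
    """Return the N x N matrix of cosine similarities between feature vectors."""
    normed = []
    for vec in features:
        # Guard against zero vectors so the division below is always defined.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, w)) for w in normed] for u in normed]


def global_affinity_loss(task_features, general_features):
    """Mean absolute difference between two affinity matrices.

    Hypothetical form: both inputs are lists of N feature vectors taken at
    the same N spatial locations. The feature dimensions may differ between
    the two inputs, since only the N x N affinities are compared.
    """
    a_task = cosine_affinity(task_features)
    a_general = cosine_affinity(general_features)
    n = len(a_task)
    total = sum(
        abs(a_task[i][j] - a_general[i][j]) for i in range(n) for j in range(n)
    )
    return total / (n * n)
```

Under these assumptions the loss is zero whenever the two feature sets induce the same pairwise similarity structure, regardless of feature scale, and it grows as the task-specific features group locations differently from the general representation.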
Findings
Outperforms state-of-the-art weakly supervised SOD methods on five benchmarks.
Achieves results comparable or superior to those of fully supervised models.
Improves accuracy with only scribble annotations, no extra labels.
Abstract
Fully supervised salient object detection (SOD) methods have made considerable progress in performance, yet these models rely heavily on expensive pixel-wise labels. Recently, to achieve a trade-off between labeling burden and performance, scribble-based SOD methods have attracted increasing attention. Previous scribble-based models perform the SOD task using only SOD training data, which carries limited information, so it is extremely difficult for them to understand the image and achieve strong SOD performance. In this paper, we propose a simple yet effective framework guided by general visual representations with rich contextual semantic knowledge for scribble-based SOD. These general visual representations are generated by self-supervised learning based on large-scale unlabeled datasets. Our framework consists of a task-related encoder, a general visual module, and an…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Face Recognition and Perception
Methods: Masked autoencoder
