A Visual Representation-guided Framework with Global Affinity for Weakly Supervised Salient Object Detection

Binwei Xu; Haoran Liang; Weihua Gong; Ronghua Liang; Peng Chen

arXiv:2302.10697 · cs.CV · June 12, 2023 · 1 cites

Open Access

TL;DR

This paper introduces a visual representation-guided framework with global affinity for weakly supervised salient object detection, leveraging self-supervised learning and semantic affinity loss to improve accuracy without extra labels.

Contribution

It proposes a novel framework that combines general visual representations with task-specific features using a global semantic affinity loss, enhancing weakly supervised SOD performance.

Findings

01. Outperforms state-of-the-art weakly supervised SOD methods on five benchmarks.

02. Achieves comparable or superior results to fully supervised models.

03. Improves accuracy with only scribble annotations, no extra labels.

Abstract

Fully supervised salient object detection (SOD) methods have made considerable progress in performance, yet these models rely heavily on expensive pixel-wise labels. Recently, to achieve a trade-off between labeling burden and performance, scribble-based SOD methods have attracted increasing attention. Previous scribble-based models perform the SOD task using only SOD training data, which carries limited information, so it is extremely difficult for them to understand the image and, in turn, to perform SOD well. In this paper, we propose a simple yet effective framework guided by general visual representations with rich contextual semantic knowledge for scribble-based SOD. These general visual representations are generated by self-supervised learning based on large-scale unlabeled datasets. Our framework consists of a task-related encoder, a general visual module, and an…

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Face Recognition and Perception

Methods: Masked autoencoder