Unsupervised Salient Object Detection with Spectral Cluster Voting
Gyungin Shin, Samuel Albanie, Weidi Xie

TL;DR
This paper introduces a novel unsupervised salient object detection method that uses spectral clustering on self-supervised features, combined with a voting mechanism, to generate pseudo groundtruth masks for training a new detector called SelfMask.
Contribution
It revisits spectral clustering for pixel grouping, proposes a winner-takes-all voting scheme for selecting salient masks from proposals generated with multiple self-supervised models, and uses the selected masks as pseudo groundtruth to train a salient object detector called SelfMask.
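The pixel-grouping step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes per-patch features are already available as an `(N, D)` array, builds a cosine-similarity affinity, and splits the patches in two by the sign of the Fiedler vector instead of running k-means on several eigenvectors. The function name `spectral_bipartition` and its arguments are hypothetical.

```python
import numpy as np


def spectral_bipartition(features, grid_hw):
    """Split patch features into two groups with the Fiedler vector.

    features: (N, D) array of per-patch features (hypothetical input).
    grid_hw:  (H, W) patch grid with H * W == N.
    Returns a boolean (H, W) mask; which side is True is arbitrary.
    """
    h, w = grid_hw
    # Cosine-similarity affinity, clipped at zero so edge weights are non-negative.
    normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    affinity = np.clip(normed @ normed.T, 0.0, None)
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalised Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree + 1e-8)
    laplacian = np.eye(len(features)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so column 1 holds the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1] * d_inv_sqrt

    return (fiedler > 0).reshape(h, w)


# Toy example: the top half and bottom half of a 4x4 grid have different features.
toy_features = np.array([[1.0, 0.1]] * 8 + [[0.1, 1.0]] * 8)
toy_mask = spectral_bipartition(toy_features, (4, 4))
```

Because the sign of an eigenvector is arbitrary, the sketch cannot say which side is the object; deciding that is the role of the mask-selection step.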
Findings
SelfMask outperforms prior unsupervised SOD methods on three benchmarks.
Spectral clustering on self-supervised features yields effective mask proposals for salient objects.
A winner-takes-all vote over the proposals selects pseudo groundtruth masks without any human annotation.
Abstract
In this paper, we tackle the challenging task of unsupervised salient object detection (SOD) by leveraging spectral clustering on self-supervised features. We make the following contributions: (i) We revisit spectral clustering and demonstrate its potential to group the pixels of salient objects; (ii) Given mask proposals from multiple applications of spectral clustering on image features computed from various self-supervised models, e.g., MoCov2, SwAV, DINO, we propose a simple but effective winner-takes-all voting mechanism for selecting the salient masks, leveraging object priors based on framing and distinctiveness; (iii) Using the selected object segmentation as pseudo groundtruth masks, we train a salient object detector, dubbed SelfMask, which outperforms prior approaches on three unsupervised SOD benchmarks. Code is publicly available at https://github.com/NoelShin/selfmask.
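The voting step in (ii) can be sketched as follows. This is one plausible reading of the abstract, not the released code: the framing prior is assumed to reject any proposal that spans the full image height or width, and distinctiveness is assumed to be scored as the mean IoU of a proposal against all other surviving proposals, with the top scorer winning. All function names here are hypothetical.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union > 0 else 0.0


def passes_framing_prior(mask):
    """Assumed framing check: reject empty masks and masks that span
    the full height or the full width of the image."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return False
    h, w = mask.shape
    spans_height = rows[0] == 0 and rows[-1] == h - 1
    spans_width = cols[0] == 0 and cols[-1] == w - 1
    return not (spans_height or spans_width)


def select_salient_mask(proposals):
    """Winner-takes-all vote: return the proposal with the highest mean IoU
    against the other proposals that pass the framing prior."""
    candidates = [m for m in proposals if passes_framing_prior(m)]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    scores = [
        np.mean([iou(m, other) for j, other in enumerate(candidates) if j != i])
        for i, m in enumerate(candidates)
    ]
    return candidates[int(np.argmax(scores))]


def box_mask(r0, r1, c0, c1, shape=(8, 8)):
    """Helper for the toy example: a boolean mask with one filled rectangle."""
    mask = np.zeros(shape, dtype=bool)
    mask[r0:r1, c0:c1] = True
    return mask


# Toy proposals, standing in for masks from different self-supervised models.
mask_a = box_mask(2, 6, 2, 6)
mask_b = box_mask(2, 6, 2, 7)
mask_c = box_mask(3, 6, 2, 6)
corner = box_mask(0, 2, 0, 2)   # an outlier that agrees with nothing
background = ~mask_a            # touches every border, so the prior rejects it
winner = select_salient_mask([mask_a, mask_b, mask_c, corner, background])
```

In the toy example the three overlapping boxes reinforce each other, so the one that agrees most with the rest is chosen, while the isolated corner mask and the frame-spanning background never win.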
Taxonomy
Topics: Visual Attention and Saliency Detection · Face Recognition and Perception
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Dense Connections · Residual Connection · Vision Transformer · LARS · Spectral Clustering
