Exploring Set Similarity for Dense Self-supervised Representation Learning
Zhaoqing Wang, Qiang Li, Guoxin Zhang, Pengfei Wan, Wen Zheng, Nannan Wang, Mingming Gong, Tongliang Liu

TL;DR
This paper introduces SetSim, a set similarity approach for dense self-supervised learning that improves robustness by filtering noise and maintaining semantic coherence, leading to better performance on various dense prediction tasks.
Contribution
The paper proposes a novel set similarity method that generalizes pixel-wise similarity, utilizing attentional features and structured neighborhood information for more robust dense representation learning.
Findings
SetSim outperforms state-of-the-art methods on object detection.
SetSim improves keypoint detection accuracy.
SetSim enhances instance and semantic segmentation results.
Abstract
By considering the spatial correspondence, dense self-supervised representation learning has achieved superior performance on various dense prediction tasks. However, the pixel-level correspondence tends to be noisy because of many similar misleading pixels, e.g., backgrounds. To address this issue, in this paper, we propose to explore set similarity (SetSim) for dense self-supervised representation learning. We generalize pixel-wise similarity learning to set-wise one to improve the robustness because sets contain more semantic and structure information. Specifically, by resorting to attentional features of views, we establish corresponding sets, thus filtering out noisy backgrounds that may cause incorrect correspondences. Meanwhile, these attentional features can keep the coherence of the same image across different views to alleviate semantic inconsistency. We…
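The idea of replacing pixel-wise similarity with set-wise similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the attention definition (sum of squared channel activations, min-max normalised), the threshold of 0.5, the mean pairwise cosine as the set-level score, and all function names are assumptions made here for illustration only.

```python
import math

# Illustrative sketch only; attention definition, threshold, and set score
# are assumptions, not the method as specified in the paper.

def attention_map(features):
    """Per-pixel attention for a flattened feature map.

    features: list of pixel vectors (length H*W), each a list of C floats.
    Assumed attention: sum of squared channel activations, scaled to [0, 1].
    """
    raw = [sum(c * c for c in pixel) for pixel in features]
    lo, hi = min(raw), max(raw)
    span = hi - lo
    if span == 0:
        # Uniform activations: treat every pixel as equally salient.
        return [1.0 for _ in raw]
    return [(r - lo) / span for r in raw]

def select_set(features, threshold=0.5):
    """Indices of pixels whose attention reaches the (assumed) threshold.

    Low-attention pixels, such as background, are left out of the set.
    """
    att = attention_map(features)
    return [i for i, a in enumerate(att) if a >= threshold]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def set_similarity(feats_a, feats_b, threshold=0.5):
    """Mean pairwise cosine similarity between the attended sets of two views."""
    set_a = select_set(feats_a, threshold)
    set_b = select_set(feats_b, threshold)
    total = sum(cosine(feats_a[i], feats_b[j]) for i in set_a for j in set_b)
    return total / (len(set_a) * len(set_b))

# Two toy "views": three pixels with two channels each. The middle pixel is
# the strongly activated (foreground) one in both views.
view_a = [[0.1, 0.0], [2.0, 0.0], [0.0, 0.1]]
view_b = [[0.0, 0.1], [3.0, 0.0], [0.1, 0.0]]
print(select_set(view_a), select_set(view_b))
print(set_similarity(view_a, view_b))
```

In this toy case the weakly activated pixels fall below the threshold in both views, so only the foreground pixels contribute to the score. A pixel-wise scheme would also compare the background pixels, which is the source of noise the abstract describes.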
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
