Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
Julien Denize, Jaonary Rabarisoa, Astrid Orcesi, Romain Hérault

TL;DR
This paper introduces Similarity Contrastive Estimation (SCE), a novel self-supervised learning method that leverages semantic similarities among instances to improve image and video representations, outperforming existing contrastive methods.
Contribution
The paper proposes SCE, a soft contrastive learning framework that incorporates semantic similarities, addressing limitations of traditional NCE-based contrastive learning.
Findings
SCE achieves competitive results on ImageNet with fewer epochs.
SCE outperforms state-of-the-art methods in video pretraining.
Learned representations generalize well to downstream tasks.
Abstract
Contrastive representation learning has proven to be an effective self-supervised learning method for images and videos. Most successful approaches are based on Noise Contrastive Estimation (NCE) and use different views of an instance as positives that should be contrasted with other instances, called negatives, that are considered as noise. However, several instances in a dataset are drawn from the same distribution and share underlying semantic information. A good data representation should contain relations between the instances, or semantic similarity and dissimilarity, that contrastive learning harms by considering all negatives as noise. To circumvent this issue, we propose a novel formulation of contrastive learning using semantic similarity between instances called Similarity Contrastive Estimation (SCE). Our training objective is a soft contrastive one that brings the positives…
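The abstract is truncated here, so the exact SCE objective is not shown. As a rough illustration of what a soft contrastive objective can look like, the sketch below mixes a one-hot positive target with an estimated similarity distribution over the negatives, then takes the cross-entropy against the predicted similarity distribution. The function names, the mixing weight `lam`, and the temperatures `tau` and `tau_target` are assumptions made for this example, not values or identifiers taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (guards against a zero vector)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores, temperature):
    """Numerically stable softmax of scores divided by a temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_contrastive_loss(query, positive, negatives,
                          lam=0.5, tau=0.1, tau_target=0.05):
    """Cross-entropy between a soft target and the predicted distribution.

    Index 0 is the positive; the remaining indices are the negatives.
    lam, tau and tau_target are hypothetical hyperparameters for this sketch.
    """
    q = l2_normalize(query)
    p = l2_normalize(positive)
    negs = [l2_normalize(n) for n in negatives]

    # Predicted distribution: query view against the positive and negatives.
    pred = softmax([dot(q, p)] + [dot(q, n) for n in negs], tau)

    # Estimated relations: positive view against the negatives only.
    relations = softmax([dot(p, n) for n in negs], tau_target)

    # Soft target: weight lam on the positive, the rest spread over negatives
    # in proportion to their estimated similarity.
    target = [lam] + [(1.0 - lam) * r for r in relations]

    return -sum(t * math.log(pr) for t, pr in zip(target, pred))


if __name__ == "__main__":
    q = [1.0, 0.0]
    p = [0.9, 0.1]
    negs = [[0.0, 1.0], [0.8, 0.2]]
    print("hard (lam=1.0):", soft_contrastive_loss(q, p, negs, lam=1.0))
    print("soft (lam=0.5):", soft_contrastive_loss(q, p, negs, lam=0.5))
```

With `lam=1.0` the target collapses to a one-hot vector and the loss reduces to the usual NCE-style hard contrastive loss. Lower values of `lam` let semantically close negatives keep part of the target mass instead of being treated purely as noise.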
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Image and Signal Denoising Methods · Photoacoustic and Ultrasonic Imaging
Methods: Contrastive Learning
