Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning
Achleshwar Luthra, Tianbao Yang, Tomer Galanti

TL;DR
This paper demonstrates that self-supervised contrastive learning implicitly approximates a supervised variant, with theoretical analysis showing the gap diminishes as class count increases, and validates findings through empirical experiments.
Contribution
It establishes a theoretical connection between self-supervised and supervised contrastive learning, characterizes the structure of learned representations, and provides bounds on few-shot learning performance.
Findings
The gap between the self-supervised contrastive (CL) loss and the negatives-only supervised contrastive (NSCL) loss decays as the number of classes increases.
CL and NSCL losses are highly correlated.
Minimizing CL loss implicitly minimizes NSCL loss, supporting effective few-shot learning.
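To make the first finding concrete, here is a minimal NumPy sketch. It assumes a simplified, one-directional InfoNCE-style form of the CL loss, and an NSCL variant that drops same-class samples from the denominator (the "negatives-only" loss that "excludes same-class contrasts"). The function names, the temperature default, and the random stand-in embeddings are illustrative assumptions; the paper's exact normalization may differ.

```python
import numpy as np


def contrastive_losses(z1, z2, labels, temperature=1.0):
    """Return (cl, nscl), each averaged over anchors.

    z1, z2: (N, d) embeddings of two augmented views of the same N samples.
    labels: (N,) integer class labels (used only by the NSCL variant).
    Assumed simplified form; not taken verbatim from the paper.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature          # sim[i, j]: anchor i vs. view of sample j
    pos = np.diag(sim)                     # similarity to the anchor's own positive
    exp_sim = np.exp(sim)

    # CL: the denominator contains the positive and every other sample.
    cl_denom = exp_sim.sum(axis=1)

    # NSCL: drop other samples that share the anchor's class.
    same_class = labels[:, None] == labels[None, :]
    keep = ~same_class | np.eye(len(labels), dtype=bool)
    nscl_denom = (exp_sim * keep).sum(axis=1)

    cl = np.mean(np.log(cl_denom) - pos)
    nscl = np.mean(np.log(nscl_denom) - pos)
    return cl, nscl


def loss_gap(num_classes, num_samples=200, dim=16, seed=0):
    """CL minus NSCL on random embeddings with balanced classes (toy setup)."""
    rng = np.random.default_rng(seed)
    labels = np.arange(num_samples) % num_classes
    z1 = rng.normal(size=(num_samples, dim))
    z2 = z1 + 0.1 * rng.normal(size=(num_samples, dim))  # stand-in for a second augmentation
    cl, nscl = contrastive_losses(z1, z2, labels)
    return cl - nscl


for c in (2, 10, 50):
    print(c, loss_gap(c))
```

In this sketch the NSCL denominator is a subset of the CL denominator, so the gap is never negative, and with a fixed number of samples it shrinks as the samples are split across more classes.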
Abstract
Despite its empirical success, the theoretical foundations of self-supervised contrastive learning (CL) are not yet fully established. In this work, we address this gap by showing that standard CL objectives implicitly approximate a supervised variant we call the negatives-only supervised contrastive loss (NSCL), which excludes same-class contrasts. We prove that the gap between the CL and NSCL losses vanishes as the number of semantic classes increases, under a bound that is both label-agnostic and architecture-independent. We characterize the geometric structure of the global minimizers of the NSCL loss: the learned representations exhibit augmentation collapse, within-class collapse, and class centers that form a simplex equiangular tight frame. We further introduce a new bound on the few-shot error of linear probing. This bound depends on two measures of feature…
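The simplex equiangular tight frame (ETF) geometry mentioned in the abstract can be illustrated with a short NumPy sketch. It uses the standard construction (center the one-hot basis, then normalize), under which every pair of distinct class centers has cosine similarity -1/(C-1). The function name `simplex_etf` is a hypothetical helper, not something from the paper.

```python
import numpy as np


def simplex_etf(num_classes):
    """Return a (C, C) matrix whose rows are unit-norm simplex ETF vectors."""
    c = num_classes
    centered = np.eye(c) - np.ones((c, c)) / c  # subtract the mean from the one-hot basis
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


centers = simplex_etf(5)
gram = centers @ centers.T  # pairwise cosine similarities between class centers
print(np.round(gram, 3))
```

The diagonal of `gram` is 1 (unit norm), every off-diagonal entry equals -1/(C-1), and the centers sum to zero, which is the maximally separated arrangement of C equal-norm vectors.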
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Face and Expression Recognition
Methods: Supervised Contrastive Loss · Contrastive Learning
