The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Zhenmei Shi, Jiefeng Chen, Kunyang Li, Jayaram Raghuram, Xi Wu, Yingyu Liang, Somesh Jha

TL;DR
This paper investigates the inherent trade-off in contrastive learning between creating universal representations and achieving label efficiency, providing theoretical insights and a regularization method to balance these goals.
Contribution
It offers a theoretical analysis of the trade-off in contrastive learning and proposes a regularization technique to improve the balance between universality and label efficiency.
Findings
More diverse pre-training data enhances universality but increases the sample complexity of downstream tasks, i.e., it reduces label efficiency.
The proposed regularization improves the trade-off in practical settings.
Empirical validation confirms the theoretical analysis.
Abstract
Pre-training representations (a.k.a. foundation models) has recently become a prevalent learning paradigm, where one first pre-trains a representation using large-scale unlabeled data, and then learns simple predictors on top of the representation using small labeled data from the downstream tasks. There are two key desiderata for the representation: label efficiency (the ability to learn an accurate classifier on top of the representation with a small amount of labeled data) and universality (usefulness across a wide range of downstream tasks). In this paper, we focus on one of the most popular instantiations of this paradigm: contrastive learning with linear probing, i.e., learning a linear predictor on the representation pre-trained by contrastive learning. We show that there exists a trade-off between the two desiderata so that one may not be able to achieve both simultaneously.…
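The paradigm described in the abstract (contrastive pre-training followed by linear probing) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the InfoNCE-style loss, the ridge-regression probe, and all names and hyperparameters (`info_nce_loss`, `fit_linear_probe`, `temperature`, `ridge`) are assumptions chosen for brevity.

```python
import numpy as np

def info_nce_loss(z_anchor, z_positive, temperature=0.5):
    """InfoNCE-style contrastive loss (assumed form, for illustration).

    Row i of z_positive is the positive for row i of z_anchor; all other
    rows in the batch act as negatives.
    """
    za = z_anchor / np.linalg.norm(z_anchor, axis=1, keepdims=True)
    zp = z_positive / np.linalg.norm(z_positive, axis=1, keepdims=True)
    logits = za @ zp.T / temperature                 # pairwise cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # positives lie on the diagonal

def fit_linear_probe(features, labels, num_classes, ridge=1e-3):
    """Fit a linear predictor on frozen features (ridge regression to one-hot targets)."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    Y = np.eye(num_classes)[labels]
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

def predict(W, features):
    """Predict class indices with the fitted linear probe."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ W, axis=1)
```

In this sketch the representation is frozen after pre-training and only the weight matrix `W` is learned from the small labeled set, which is the quantity that label efficiency refers to.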
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Multimodal Machine Learning Applications
Methods: Contrastive Learning
