What Should Not Be Contrastive in Contrastive Learning
Tete Xiao, Xiaolong Wang, Alexei A. Efros, Trevor Darrell

TL;DR
This paper proposes a contrastive learning framework that separately captures invariant and varying factors in visual data, improving transferability across diverse downstream tasks without prior knowledge of specific invariances.
Contribution
The paper introduces a multi-head contrastive learning model with separate embedding spaces for invariant and varying features, which outperforms the baselines on multiple downstream tasks.
Findings
Separate invariant and varying spaces improve downstream performance.
Concatenation of both spaces yields the best results.
The method outperforms baselines on various classification and corruption tasks.
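The concatenation finding can be illustrated with a minimal sketch. It assumes each embedding space yields a plain vector for an image and that each vector is L2-normalized before joining; the normalization step, the function names, and the example spaces ("color", "rotation") are illustrative assumptions, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumption: applied per space)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_spaces(head_embeddings):
    """Join the per-space embeddings of one image into a single feature."""
    feature = []
    for emb in head_embeddings:
        feature.extend(l2_normalize(emb))
    return feature


# Hypothetical outputs of three heads for one image.
invariant = [3.0, 4.0]         # space invariant to every augmentation
color_varying = [0.0, 2.0]     # space that stays sensitive to color
rotation_varying = [1.0, 0.0]  # space that stays sensitive to rotation

feature = concat_spaces([invariant, color_varying, rotation_varying])
```

A downstream classifier would then be trained on `feature`, so it can draw on whichever space carries the information its task needs.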
Abstract
Recent self-supervised contrastive methods have been able to produce impressive transferable visual representations by learning to be invariant to different data augmentations. However, these methods implicitly assume a particular set of representational invariances (e.g., invariance to color), and can perform poorly when a downstream task violates this assumption (e.g., distinguishing red vs. yellow cars). We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances. Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces, each of which is invariant to all but one augmentation. We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks. We further find…
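One way to read "invariant to all but one augmentation" is as a rule for choosing positive pairs in each embedding space. The sketch below assumes that, within the space tied to a given augmentation, a key view of the same image counts as a positive only if it shares the query's parameter for that augmentation, while the fully invariant space treats every view of the image as a positive. The function name, the parameter dictionaries, and the augmentation names are hypothetical and chosen for illustration.

```python
def positive_mask(query_params, key_params, varying_aug=None):
    """Mark which key views of the same image are positives in one space.

    varying_aug=None  -> fully invariant space: every view is a positive.
    varying_aug=name  -> space that stays sensitive to `name`: a key is a
                         positive only if it shares the query's parameter
                         for that augmentation (assumed rule).
    """
    if varying_aug is None:
        return [True for _ in key_params]
    return [k[varying_aug] == query_params[varying_aug] for k in key_params]


# Hypothetical augmentation parameters for one image.
query = {"color": 0.2, "rotation": 90}
keys = [
    {"color": 0.7, "rotation": 180},  # independent parameters
    {"color": 0.2, "rotation": 0},    # shares the query's color parameter
    {"color": 0.5, "rotation": 90},   # shares the query's rotation parameter
]

masks = {
    space: positive_mask(query, keys, space)
    for space in (None, "color", "rotation")
}
```

Under this rule, the second key is the only positive in the color-sensitive space and the third key is the only positive in the rotation-sensitive space. Views of other images would be negatives in every space, as in standard contrastive learning.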
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: Contrastive Learning
