Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Yu Gui, Cong Ma, Yiqiao Zhong

TL;DR
This paper investigates how contrastive learning projectors influence representation quality, revealing expansion and shrinkage effects that impact downstream classification performance, supported by theoretical analysis and extensive experiments.
Contribution
It introduces the concepts of expansion and shrinkage effects caused by contrastive loss and models projector behavior with linear transformations to explain downstream accuracy.
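To make "expansion" and "shrinkage" concrete, here is a minimal sketch of a linear map that rescales a single signal direction and leaves every orthogonal direction unchanged. The parametrization `I + (factor - 1) u u^T`, the function name `scale_signal_direction`, and the two-class toy data are illustrative assumptions, not the family of transformations defined in the paper.

```python
import numpy as np

def scale_signal_direction(features, direction, factor):
    """Scale the component of each row of `features` along `direction` by `factor`.

    factor > 1 mimics expansion and factor < 1 mimics shrinkage.
    The form I + (factor - 1) u u^T is an assumption made for illustration.
    """
    u = direction / np.linalg.norm(direction)
    transform = np.eye(u.shape[0]) + (factor - 1.0) * np.outer(u, u)
    # transform is symmetric, so right-multiplying rows needs no transpose
    return features @ transform

# Toy two-class data: label * signal + isotropic Gaussian noise (hypothetical setup)
rng = np.random.default_rng(0)
dim = 5
signal = np.zeros(dim)
signal[0] = 1.0
labels = rng.choice([-1.0, 1.0], size=200)
features = labels[:, None] * signal + rng.normal(size=(200, dim))

expanded = scale_signal_direction(features, signal, 2.0)  # signal coordinate doubled
shrunk = scale_signal_direction(features, signal, 0.5)    # signal coordinate halved
```

Only the coordinate along `signal` changes, so the ratio of signal to noise along that direction relative to the other directions is what the factor controls in this toy setting.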
Findings
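The contrastive loss either expands or shrinks the signal direction in the representations learned by the encoder, depending on factors such as augmentation strength and temperature.
Linear projectors in the shrinkage regime hinder downstream classification accuracy.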
Theoretical analysis matches experimental results on synthetic and real data.
Abstract
We investigate the role of projection heads, also known as projectors, within the encoder-projector framework (e.g., SimCLR) used in contrastive learning. We aim to demystify the observed phenomenon where representations learned before projectors outperform those learned after -- measured using the downstream linear classification accuracy, even when the projectors themselves are linear. In this paper, we make two significant contributions towards this aim. Firstly, through empirical and theoretical analysis, we identify two crucial effects -- expansion and shrinkage -- induced by the contrastive loss on the projectors. In essence, contrastive loss either expands or shrinks the signal direction in the representations learned by an encoder, depending on factors such as the augmentation strength, the temperature used in contrastive loss, etc. Secondly, drawing inspiration from the…
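The encoder-projector setup described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the encoder is a random linear map, additive noise stands in for data augmentation, and `info_nce_loss` is a one-directional InfoNCE-style loss rather than the full NT-Xent loss used by SimCLR. All names and dimensions are hypothetical.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """One-directional InfoNCE-style loss.

    z1[i] and z2[i] form a positive pair; the other rows of z2 act as negatives.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature            # cosine similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
n, d_in, d_rep, d_proj = 8, 10, 6, 4
encoder_weights = rng.normal(size=(d_in, d_rep))      # stand-in for a trained encoder
projector_weights = rng.normal(size=(d_rep, d_proj))  # linear projector

x = rng.normal(size=(n, d_in))
view1 = x + 0.1 * rng.normal(size=x.shape)  # noise as a stand-in for augmentation
view2 = x + 0.1 * rng.normal(size=x.shape)

# Pre-projector representations: what a downstream linear classifier would use
h1, h2 = view1 @ encoder_weights, view2 @ encoder_weights
# Post-projector representations: what the contrastive loss is computed on
z1, z2 = h1 @ projector_weights, h2 @ projector_weights

loss = info_nce_loss(z1, z2, temperature=0.5)
```

The distinction the paper studies is between `h1` (before the projector) and `z1` (after it): the loss only ever sees `z1` and `z2`, while downstream accuracy can be measured on either.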
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Reservoir Computing · Domain Adaptation and Few-Shot Learning
