Understanding the Behaviour of Contrastive Loss
Feng Wang, Huaping Liu

TL;DR
This paper analyzes the behavior of unsupervised contrastive loss, revealing its role as a hardness-aware function controlled by temperature, and discusses the trade-off between uniformity and tolerance affecting feature learning and downstream tasks.
Contribution
It provides a theoretical understanding of contrastive loss, highlighting the uniformity-tolerance dilemma and the impact of temperature on feature quality and downstream performance.
Findings
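Contrastive loss is hardness-aware, and the temperature τ controls the strength of penalties on hard negative samples.
Uniformity promotes separable features, but excessive pursuit of uniformity makes the loss intolerant of semantically similar samples and can break the underlying semantic structure.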
Optimal temperature balances uniformity and tolerance, improving downstream results.
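The hardness-aware behaviour in the findings above can be illustrated with a small sketch. It assumes the standard softmax-based (InfoNCE-style) contrastive loss for a single anchor; the function names and similarity values are hypothetical and are not taken from the paper's code. For that loss, the gradient on each negative similarity is proportional to exp(s / τ), so the relative penalty across negatives is a softmax over the negative similarities scaled by 1 / τ.

```python
import math


def contrastive_loss(pos_sim, neg_sims, tau):
    """InfoNCE-style loss for one anchor: -log softmax probability of the positive."""
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]


def negative_penalty_weights(neg_sims, tau):
    """Relative gradient magnitude on each negative: softmax(neg_sims / tau)."""
    scaled = [s / tau for s in neg_sims]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities to four negatives; 0.9 is the hardest one.
neg_sims = [0.9, 0.5, 0.0, -0.5]

for tau in (0.07, 0.5, 5.0):
    weights = negative_penalty_weights(neg_sims, tau)
    print(f"tau={tau}: {[round(w, 3) for w in weights]}")
```

With a small τ nearly all of the penalty falls on the hardest negative, while a large τ spreads it almost evenly across negatives, which is one way to see the uniformity-tolerance trade-off described here.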
Abstract
Unsupervised contrastive learning has achieved outstanding success, while the mechanism of contrastive loss has been less studied. In this paper, we concentrate on the understanding of the behaviours of unsupervised contrastive loss. We will show that the contrastive loss is a hardness-aware loss function, and the temperature τ controls the strength of penalties on hard negative samples. The previous study has shown that uniformity is a key property of contrastive learning. We build relations between the uniformity and the temperature τ. We will show that uniformity helps the contrastive learning to learn separable features, however excessive pursuit to the uniformity makes the contrastive loss not tolerant to semantically similar samples, which may break the underlying semantic structure and be harmful to the formation of features useful for downstream tasks. This is caused…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Contrastive Learning
