Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li, Alexei A. Efros, Deepak Pathak

TL;DR
This paper investigates why non-contrastive self-supervised learning methods such as SimSiam sometimes collapse, and how dataset size, architecture, and training strategy affect their performance. It proposes a metric and training techniques to detect and prevent collapse and improve results.
Contribution
It provides an empirical analysis of collapse in non-contrastive SSL methods, introduces a metric to predict collapse, and demonstrates how continual learning can mitigate collapse and boost performance.
Findings
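SimSiam is highly sensitive to dataset and model size: its representations partially collapse when the model is too small relative to the dataset.
A metric measuring the degree of collapse can forecast downstream task performance without fine-tuning or labels.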
Continual learning prevents collapse and improves accuracy.
Abstract
Contrastive methods have led a recent surge in the performance of self-supervised representation learning (SSL). Recent methods like BYOL or SimSiam purportedly distill these contrastive methods down to their essence, removing bells and whistles, including the negative examples, that do not contribute to downstream performance. These "non-contrastive" methods work surprisingly well without using negatives even though the global minimum lies at trivial collapse. We empirically analyze these non-contrastive methods and find that SimSiam is extraordinarily sensitive to dataset and model size. In particular, SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size. We propose a metric to measure the degree of this collapse and show that it can be used to forecast the downstream task performance without any fine-tuning or labels. We…
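To make the idea of "partial dimensional collapse" concrete, the sketch below scores how concentrated the variance of a batch of embeddings is across dimensions. This is a generic illustration, not the paper's metric: the truncated abstract does not give the exact definition, so the function name `collapse_score`, the use of cumulative explained variance, and the tolerance value are all assumptions made here. A score near 0.5 suggests variance is spread evenly over dimensions, while a score near 1.0 suggests most dimensions carry almost no variance.

```python
import numpy as np


def collapse_score(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """Hypothetical collapse score (an assumption, not the paper's definition).

    embeddings: array of shape (n_samples, dim), with n_samples >= dim.
    Returns the mean of the cumulative explained-variance curve.
    """
    # Center the embeddings so singular values reflect variance, not the mean.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)

    # Singular values come back sorted in descending order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()

    # All embeddings (nearly) identical: treat as complete collapse.
    if total < eps:
        return 1.0

    # Fraction of variance explained by the top-k directions, for each k.
    cumulative = np.cumsum(variances) / total
    return float(cumulative.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    healthy = rng.standard_normal((2000, 16))   # variance in all 16 dimensions
    collapsed = healthy.copy()
    collapsed[:, 4:] = 0.0                      # variance in only 4 dimensions
    print("healthy:", collapse_score(healthy))
    print("collapsed:", collapse_score(collapsed))
```

Because the score needs only the embeddings themselves, it can be computed on unlabeled data, which is the property the abstract highlights for its own metric.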
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
Methods: Bootstrap Your Own Latent
