Big Self-Supervised Models are Strong Semi-Supervised Learners
Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton

TL;DR
This paper demonstrates that large self-supervised models, when fine-tuned with limited labeled data, can serve as highly effective semi-supervised learners, significantly reducing label requirements for ImageNet classification.
Contribution
It introduces a semi-supervised learning approach combining big models, self-supervised pretraining, and distillation, achieving state-of-the-art label efficiency on ImageNet.
Findings
Achieves 73.9% top-1 accuracy on ImageNet with only 1% of the labels.
With 10% of the labels, outperforms standard supervised training that uses all of the labels.
Big models benefit more from unlabeled data when fewer labels are available.
Abstract
One paradigm for learning from few labeled examples while making best use of a large amount of unlabeled data is unsupervised pretraining followed by supervised fine-tuning. Although this paradigm uses unlabeled data in a task-agnostic way, in contrast to common approaches to semi-supervised learning for computer vision, we show that it is surprisingly effective for semi-supervised learning on ImageNet. A key ingredient of our approach is the use of big (deep and wide) networks during pretraining and fine-tuning. We find that, the fewer the labels, the more this approach (task-agnostic use of unlabeled data) benefits from a bigger network. After fine-tuning, the big network can be further improved and distilled into a much smaller one with little loss in classification accuracy by using the unlabeled examples for a second time, but in a task-specific way. The proposed semi-supervised…
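The task-specific distillation step described in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which a fine-tuned teacher produces temperature-softened class probabilities on unlabeled examples and the student is trained to match them with a cross-entropy loss. The function names, the temperature default, and the averaging over the batch are illustrative choices here, not taken from the authors' code.

```python
import math


def log_softmax_with_temperature(logits, temperature=1.0):
    """Return log-probabilities of softmax(logits / temperature)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    return [math.exp(lp) for lp in log_softmax_with_temperature(logits, temperature)]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy between teacher and student distributions.

    Both arguments are lists of per-example logit lists computed on the
    same unlabeled examples. No ground-truth labels are used. The loss is
    averaged over the batch (an assumption made for this sketch).
    """
    total = 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        p_teacher = softmax_with_temperature(t_logits, temperature)
        log_p_student = log_softmax_with_temperature(s_logits, temperature)
        total -= sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))
    return total / len(teacher_logits)


if __name__ == "__main__":
    # Hypothetical logits for two unlabeled images and three classes.
    teacher = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]
    student = [[1.0, 0.2, -0.5], [0.0, 1.0, 0.3]]
    print(distillation_loss(teacher, student, temperature=1.0))
```

The loss is smallest when the student reproduces the teacher's distribution exactly, which is why unlabeled images are sufficient for this stage: the teacher's predictions act as the training targets.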
Code & Models
- 🤗lightly-ai/simclrv2-imagenet1k-r50_1x_sk0model
- 🤗lightly-ai/simclrv2-imagenet1k-r50_1x_sk1model
- 🤗lightly-ai/simclrv2-imagenet1k-r50_2x_sk0model
- 🤗lightly-ai/simclrv2-imagenet1k-r50_2x_sk1model
- 🤗lightly-ai/simclrv2-imagenet1k-r101_1x_sk0model
- 🤗lightly-ai/simclrv2-imagenet1k-r101_1x_sk1model
- 🤗lightly-ai/simclrv2-imagenet1k-r101_2x_sk0model
- 🤗lightly-ai/simclrv2-imagenet1k-r101_2x_sk1model
- 🤗lightly-ai/simclrv2-imagenet1k-r152_1x_sk0model
- 🤗lightly-ai/simclrv2-imagenet1k-r152_1x_sk1model
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: SimCLRv2 · 1x1 Convolution · Bottleneck Residual Block · Batch Normalization · Average Pooling · Max Pooling · Global Average Pooling · Residual Connection · Kaiming Initialization
