Self-training with Noisy Student improves ImageNet classification
Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le

TL;DR
Noisy Student Training is a semi-supervised learning method that improves ImageNet classification accuracy and robustness by iteratively training equal-or-larger student models, with noise added, on labeled images plus teacher-generated pseudo labels.
Contribution
The paper introduces Noisy Student Training, a semi-supervised approach that extends self-training and distillation by using equal-or-larger student models, adding noise to the student during learning, and iterating the teacher-student cycle.
Findings
Achieved 88.4% top-1 accuracy on ImageNet.
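Improved robustness: ImageNet-A top-1 accuracy rose from 61.0% to 83.7%, ImageNet-C mean corruption error fell from 45.7 to 28.3, and ImageNet-P mean flip rate fell from 27.8 to 12.2.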
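Outperformed the previous state-of-the-art model, which requires 3.5B weakly labeled Instagram images, by 2.0% top-1 accuracy while using 300M unlabeled images for pseudo labeling.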
Abstract
We present Noisy Student Training, a semi-supervised learning approach that works well even when labeled data is abundant. Noisy Student Training achieves 88.4% top-1 accuracy on ImageNet, which is 2.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 61.0% to 83.7%, reduces ImageNet-C mean corruption error from 45.7 to 28.3, and reduces ImageNet-P mean flip rate from 27.8 to 12.2. Noisy Student Training extends the idea of self-training and distillation with the use of equal-or-larger student models and noise added to the student during learning. On ImageNet, we first train an EfficientNet model on labeled images and use it as a teacher to generate pseudo labels for 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and…
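The teacher-student loop described in the abstract can be sketched on toy data. This is a minimal illustration under stated assumptions, not the authors' implementation: a 2-D logistic regression stands in for EfficientNet, Gaussian input jitter stands in for the student's noise, pseudo labels are hard (0/1), the student has the same size as the teacher (the "equal" case of equal-or-larger), and all function names (`make_blobs`, `train_logreg`, `noisy_student`, `accuracy`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_blobs(n, rng):
    # Toy data (assumption): two Gaussian blobs centred at (-1.5, -1.5) and (1.5, 1.5).
    xs, ys = [], []
    for i in range(n):
        y = i % 2
        c = 1.5 if y == 1 else -1.5
        xs.append((rng.gauss(c, 1.0), rng.gauss(c, 1.0)))
        ys.append(y)
    return xs, ys


def train_logreg(xs, ys, noise_std=0.0, epochs=50, lr=0.1, seed=0):
    # Fit logistic regression by SGD; noise_std > 0 jitters each input,
    # a toy stand-in for the noise applied to the student.
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)
        for (x1, x2), y in data:
            x1 += rng.gauss(0.0, noise_std)
            x2 += rng.gauss(0.0, noise_std)
            g = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # gradient of log loss wrt logit
            w[0] -= lr * g * x1
            w[1] -= lr * g * x2
            b -= lr * g
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5 else 0


def accuracy(model, xs, ys):
    return sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(ys)


def noisy_student(labeled_x, labeled_y, unlabeled_x, iterations=3, noise_std=0.3):
    # 1. Train a teacher on the labeled data only.
    teacher = train_logreg(labeled_x, labeled_y)
    for it in range(iterations):
        # 2. The teacher (predicting without noise) labels the unlabeled pool.
        pseudo_y = [predict(teacher, x) for x in unlabeled_x]
        # 3. Train a noised student on labeled plus pseudo-labeled data.
        student = train_logreg(labeled_x + unlabeled_x, labeled_y + pseudo_y,
                               noise_std=noise_std, seed=it + 1)
        # 4. The student becomes the teacher for the next round.
        teacher = student
    return teacher


rng = random.Random(0)
labeled_x, labeled_y = make_blobs(20, rng)
unlabeled_x, _ = make_blobs(200, rng)      # labels discarded to mimic unlabeled images
test_x, test_y = make_blobs(400, rng)
final_model = noisy_student(labeled_x, labeled_y, unlabeled_x)
print("held-out accuracy:", accuracy(final_model, test_x, test_y))
```

In the setting the abstract describes, the student would be a larger network than the teacher; this sketch keeps the model size fixed only to stay short.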
Code & Models
- 🤗 keras-io/randaugment · model · 2 dl
- 🤗 timm/tf_efficientnet_b0.ns_jft_in1k · model · 119k dl · ♡ 3
- 🤗 timm/tf_efficientnet_b1.ns_jft_in1k · model · 30k dl
- 🤗 timm/tf_efficientnet_b2.ns_jft_in1k · model · 32k dl
- 🤗 timm/tf_efficientnet_b3.ns_jft_in1k · model · 223k dl · ♡ 1
- 🤗 timm/tf_efficientnet_b4.ns_jft_in1k · model · 40k dl
- 🤗 timm/tf_efficientnet_b5.ns_jft_in1k · model · 28k dl · ♡ 1
- 🤗 timm/tf_efficientnet_b6.ns_jft_in1k · model · 3.8k dl · ♡ 1
- 🤗 timm/tf_efficientnet_b7.ns_jft_in1k · model · 8.5k dl
- 🤗 timm/tf_efficientnet_l2.ns_jft_in1k · model · 193 dl · ♡ 1
Videos
Self-training with Noisy Student improves ImageNet classification (Paper Explained)· youtube
Self-Training With Noisy Student Improves ImageNet Classification· youtube
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Test · RMSProp · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Sigmoid Activation · Batch Normalization · Step Decay · Squeeze-and-Excitation Block
