Self-supervised Pretraining of Visual Features in the Wild
Priya Goyal, Mathilde Caron, Benjamin Lefaudeux, Min Xu, Pengchao Wang, Vivek Pai, Mannat Singh, Vitaliy Liptchinsky, Ishan Misra, Armand Joulin, Piotr Bojanowski

TL;DR
This paper shows that large self-supervised models trained on random, uncurated images can reach high accuracy (84.2% top-1 on ImageNet), supporting the practical viability of self-supervised learning beyond curated datasets.
Contribution
The authors train a large-scale self-supervised model on uncurated images, showing it surpasses previous models and performs well in real-world scenarios.
Findings
The SEER model achieves 84.2% top-1 accuracy on ImageNet.
Self-supervised models are effective few-shot learners.
Training on uncurated data is feasible and effective.
Abstract
Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a control environment, that is the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore if self-supervision lives to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real world setting. Interestingly, we also observe that self-supervised models are good few-shot learners achieving 77.9% top-1 with access to…
Code & Models
- 🤗 timm/regnety_320.seer · model · 52 dl
- 🤗 timm/regnety_320.seer_ft_in1k · model · 60 dl
- 🤗 timm/regnety_640.seer · model · 89 dl
- 🤗 timm/regnety_640.seer_ft_in1k · model · 67 dl
- 🤗 timm/regnety_1280.seer · model · 59 dl
- 🤗 timm/regnety_1280.seer_ft_in1k · model · 32 dl
- 🤗 timm/regnety_2560.seer_ft_in1k · model · 61 dl
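As a rough illustration of how the checkpoints listed above might be loaded, the sketch below builds a timm model name from the architecture and variant and passes it to `timm.create_model`. The helper names `seer_model_name` and `load_seer` are hypothetical and not part of timm; only the checkpoint identifiers come from the list. It assumes a recent timm release that resolves `architecture.tag` names to pretrained weights, and it imports timm lazily so the name helper works without it.

```python
# Minimal sketch, not an official loader. Helper names are hypothetical;
# the checkpoint identifiers are the ones listed on this page.
SEER_CHECKPOINTS = {
    "regnety_320": ("seer", "seer_ft_in1k"),
    "regnety_640": ("seer", "seer_ft_in1k"),
    "regnety_1280": ("seer", "seer_ft_in1k"),
    "regnety_2560": ("seer_ft_in1k",),  # only the fine-tuned variant is listed
}


def seer_model_name(arch, finetuned=True):
    """Return the timm model name, e.g. 'regnety_320.seer_ft_in1k'."""
    tag = "seer_ft_in1k" if finetuned else "seer"
    if arch not in SEER_CHECKPOINTS or tag not in SEER_CHECKPOINTS[arch]:
        raise ValueError(f"no listed checkpoint for {arch!r} with tag {tag!r}")
    return f"{arch}.{tag}"


def load_seer(arch, finetuned=True, pretrained=True):
    """Create the model with timm and put it in eval mode.

    Assumes timm and torch are installed; pretrained=True downloads weights.
    """
    import timm  # imported lazily so the name helper has no dependencies

    model = timm.create_model(seer_model_name(arch, finetuned), pretrained=pretrained)
    return model.eval()
```

For example, `load_seer("regnety_320")` would request the ImageNet-1k fine-tuned checkpoint, while `load_seer("regnety_320", finetuned=False)` would request the self-supervised weights without the fine-tuning step. The larger variants are very large downloads, so the smallest one is probably the sensible starting point.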
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Video Surveillance and Tracking Methods
Methods: Cosine Annealing · SEER · Gradient Checkpointing · Bootstrap Your Own Latent · Residual Connection · Bottleneck Residual Block · Kaiming Initialization · Residual Block · Color Jitter · Max Pooling
