TL;DR
This paper evaluates 13 top self-supervised visual models on 40 downstream tasks. The best of them outperform a supervised baseline on most tasks, and an analysis of their features shows how their representations differ from supervised ones.
Contribution
It provides the first large-scale comparison of self-supervised models across many-shot and few-shot recognition, object detection, and dense prediction, highlighting their strengths, their limitations, and how they differ from supervised learning.
Findings
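The best self-supervised models outperform the supervised baseline on most tasks.
ImageNet Top-1 accuracy correlates strongly with many-shot recognition transfer, but increasingly less with few-shot recognition, object detection, and dense prediction.
No single self-supervised method dominates across all tasks.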
Self-supervised models tend to preserve less color information but have better calibration and less overfitting.
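The correlation finding above can be made concrete with a small sketch. It assumes the Pearson coefficient as the correlation measure (this summary does not say which statistic the paper uses), and every number in it is a made-up placeholder, not a result from the paper. The names `pearson`, `imagenet_top1`, `many_shot`, and `few_shot` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores for four hypothetical models (one entry per model).
# These are illustrative values only, NOT numbers reported in the paper.
imagenet_top1 = [60.0, 65.0, 70.0, 75.0]
many_shot = [70.0, 74.0, 79.0, 83.0]  # tracks ImageNet accuracy closely
few_shot = [55.0, 61.0, 54.0, 60.0]   # tracks ImageNet accuracy only loosely

r_many = pearson(imagenet_top1, many_shot)
r_few = pearson(imagenet_top1, few_shot)

print(f"many-shot correlation: {r_many:.2f}")
print(f"few-shot correlation:  {r_few:.2f}")
```

With real per-model scores in place of the placeholders, a high coefficient for one task family and a low one for another would reflect the pattern described in the finding: ImageNet accuracy predicts some kinds of transfer much better than others.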
Abstract
Self-supervised visual representation learning has seen huge progress recently, but no large scale evaluation has compared the many models now available. We evaluate the transfer performance of 13 top self-supervised models on 40 downstream tasks, including many-shot and few-shot recognition, object detection, and dense prediction. We compare their performance to a supervised baseline and show that on most tasks the best self-supervised models outperform supervision, confirming the recently observed trend in the literature. We find ImageNet Top-1 accuracy to be highly correlated with transfer to many-shot recognition, but increasingly less so for few-shot, object detection and dense prediction. No single self-supervised method dominates overall, suggesting that universal pre-training is still unsolved. Our analysis of features suggests that top self-supervised learners fail to preserve…
Taxonomy
Methods: MoCo v2 · Momentum Contrast · PIRL · SimCLR · NPID · Swapping Assignments between Views · Bootstrap Your Own Latent
