A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification
Markus Marks, Manuel Knott, Neehar Kondapaneni, Elijah Cole, Thijs Defraeye, Fernando Perez-Cruz, Pietro Perona

TL;DR
This paper investigates how well common in-domain evaluation protocols predict the performance of self-supervised learning models on various downstream image classification tasks, highlighting the influence of dataset type, model architecture, and normalization.
Contribution
It provides a comprehensive analysis of how in-domain evaluation metrics correlate with out-of-domain performance across multiple datasets and SSL models.
Findings
Linear and kNN probing are strong predictors of out-of-domain performance.
Batch normalization significantly affects correlation robustness.
Performance differences between discriminative and generative SSL methods are mainly due to backbone architecture.
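To make the idea of a metric "predicting" out-of-domain performance concrete, the sketch below computes a rank correlation between an in-domain score and an out-of-domain score across several models. This is an illustration only: the choice of Spearman correlation is an assumption (this summary does not say which statistic the paper uses), the accuracy values are made-up placeholders rather than results from the paper, and the helper names `ranks`, `pearson`, and `spearman` are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder accuracies for five hypothetical SSL models (NOT paper results).
in_domain_linear = [0.71, 0.75, 0.68, 0.80, 0.77]
out_of_domain = [0.52, 0.58, 0.55, 0.66, 0.61]

rho = spearman(in_domain_linear, out_of_domain)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho close to 1 would mean that ranking models by the in-domain metric nearly reproduces their out-of-domain ranking, which is the sense in which a protocol can be called a strong predictor.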
Abstract
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels. The model is forced to learn about the data structure or context by solving a pretext task. With SSL, models can learn from abundant and cheap unlabeled data, significantly reducing the cost of training models where labels are expensive or inaccessible. In Computer Vision, SSL is widely used as pre-training followed by a downstream task, such as supervised transfer, few-shot learning on smaller labeled data sets, and/or unsupervised clustering. Unfortunately, it is infeasible to evaluate SSL methods on all possible downstream tasks and objectively measure the quality of the learned representation. Instead, SSL methods are evaluated using in-domain evaluation protocols, such as fine-tuning, linear probing, and k-nearest neighbors (kNN).…
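As a rough illustration of one of the evaluation protocols named in the abstract, the sketch below runs a kNN probe on frozen features: each test sample is labeled by majority vote among its k most similar training samples. The use of cosine similarity, the value of k, the function names, and the toy 2-D feature vectors are all assumptions for illustration; the paper's actual setup may differ, and real features would come from a pre-trained SSL backbone.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_predict(train_feats, train_labels, query, k=3):
    """Label a query by majority vote among its k most similar neighbors."""
    scored = sorted(
        ((cosine(f, query), y) for f, y in zip(train_feats, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3):
    """Fraction of test samples whose kNN prediction matches the label."""
    correct = sum(
        knn_predict(train_feats, train_labels, q, k) == y
        for q, y in zip(test_feats, test_labels)
    )
    return correct / len(test_labels)


# Toy stand-ins for frozen backbone features (hypothetical values).
train_feats = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0],
               [0.1, 1.0], [0.2, 0.9], [0.0, 0.8]]
train_labels = [0, 0, 0, 1, 1, 1]
test_feats = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]
test_labels = [0, 1, 0]

acc = knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3)
print(f"kNN probe accuracy: {acc:.2f}")
```

The backbone is never updated in this protocol, so the score reflects only the quality of the frozen representation. Linear probing follows the same idea but trains a single linear classifier on the frozen features instead of voting among neighbors.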
Taxonomy
Topics: Human Resource Development and Performance Evaluation
Methods: Batch Normalization
