A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification

Markus Marks; Manuel Knott; Neehar Kondapaneni; Elijah Cole; Thijs Defraeye; Fernando Perez-Cruz; Pietro Perona

arXiv:2407.12210 · cs.CV · July 19, 2024 · 1 citation

TL;DR

This paper investigates how well common in-domain evaluation protocols predict the performance of self-supervised learning models on various downstream image classification tasks, highlighting the influence of dataset type, model architecture, and normalization.

Contribution

It provides a comprehensive analysis of the correlation between in-domain evaluation metrics and out-of-domain performance across multiple datasets and models in SSL.
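To make the idea of correlating in-domain and out-of-domain scores concrete, here is a minimal sketch that computes a Spearman rank correlation across a handful of models. This summary does not say which correlation statistic the paper uses, so Spearman is an assumption, and the accuracy values below are made-up placeholders, not results from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical accuracies for five SSL checkpoints (illustrative only).
in_domain = [0.62, 0.70, 0.74, 0.68, 0.77]      # e.g. a linear probe on the pre-training domain
out_of_domain = [0.41, 0.52, 0.50, 0.47, 0.58]  # e.g. a probe on a different dataset

rho = spearman(in_domain, out_of_domain)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho near 1 would mean that ranking models by the in-domain protocol reproduces their out-of-domain ranking; a value near 0 would mean the in-domain protocol says little about transfer.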

Findings

01. Linear and kNN probing are strong predictors of out-of-domain performance.

02. Batch normalization significantly affects correlation robustness.

03. Discriminative and generative SSL methods' performance differences are mainly due to backbone architecture.

Abstract

Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels. The model is forced to learn about the data structure or context by solving a pretext task. With SSL, models can learn from abundant and cheap unlabeled data, significantly reducing the cost of training models where labels are expensive or inaccessible. In Computer Vision, SSL is widely used as pre-training followed by a downstream task, such as supervised transfer, few-shot learning on smaller labeled data sets, and/or unsupervised clustering. Unfortunately, it is infeasible to evaluate SSL methods on all possible downstream tasks and objectively measure the quality of the learned representation. Instead, SSL methods are evaluated using in-domain evaluation protocols, such as fine-tuning, linear probing, and k-nearest neighbors (kNN).…
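As a rough illustration of the kNN protocol named in the abstract, the sketch below classifies frozen feature vectors by majority vote among their nearest neighbors. The cosine similarity, the value of k, and the toy 2-D features are all assumptions made for this example; the paper's actual settings are not given in this summary.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_predict(train_feats, train_labels, query, k=3):
    """Predict a label by majority vote among the k most similar training features."""
    scored = sorted(
        ((cosine(f, query), y) for f, y in zip(train_feats, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3):
    """Fraction of test features whose kNN prediction matches the true label."""
    correct = sum(
        knn_predict(train_feats, train_labels, q, k) == y
        for q, y in zip(test_feats, test_labels)
    )
    return correct / len(test_labels)


# Toy stand-ins for embeddings produced by a frozen SSL backbone (hypothetical).
train_feats = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.3], [0.1, 1.0], [0.2, 0.9], [0.3, 0.8]]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
test_feats = [[0.95, 0.15], [0.15, 0.95]]
test_labels = ["cat", "dog"]

accuracy = knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3)
print(f"kNN probe accuracy: {accuracy:.2f}")
```

Because the backbone is never updated, a probe of this kind measures the quality of the learned representation itself rather than how well the network adapts under further training.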

Taxonomy

Topics: Human Resource Development and Performance Evaluation

Methods: Batch Normalization