Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods

Yuchen Lu; Zhen Liu; Aristide Baratin; Romain Laroche; Aaron Courville; Alessandro Sordoni

arXiv:2206.01251 · cs.LG · November 16, 2023

TL;DR

This paper introduces CLID, a new evaluation method for self-supervised learning models based on representation expressiveness and learnability, which correlates well with model performance both in-distribution and out-of-domain.

Contribution

The paper proposes CLID, combining Intrinsic Dimension and Cluster Learnability, as a novel, label-free evaluation metric for SSL models that outperforms existing schemes.

Findings

01 CLID correlates better with in-distribution model performance than competing recent evaluation schemes.

02 CLID effectively predicts out-of-domain transfer performance.

03 The approach is validated in a large-scale empirical study across a diverse family of SSL algorithms.

Abstract

We address the problem of evaluating the quality of self-supervised learning (SSL) models without access to supervised labels, while being agnostic to the architecture, learning algorithm or data manipulation used during training. We argue that representations can be evaluated through the lens of expressiveness and learnability. We propose to use the Intrinsic Dimension (ID) to assess expressiveness and introduce Cluster Learnability (CL) to assess learnability. CL is measured in terms of the performance of a KNN classifier trained to predict labels obtained by clustering the representations with K-means. We thus combine CL and ID into a single predictor -- CLID. Through a large-scale empirical study with a diverse family of SSL algorithms, we find that CLID better correlates with in-distribution model performance than other competing recent evaluation schemes. We also benchmark CLID on…
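The Cluster Learnability idea in the abstract (cluster the representations with K-means, then check how well a KNN classifier predicts those cluster labels) can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the authors' implementation: the function names, the default cluster count and `k`, the 50/50 train/test split, and the use of held-out accuracy as the score are all hypothetical choices. The abstract does not specify how CL is evaluated in detail or how it is combined with ID, so the sketch covers CL only.

```python
import numpy as np


def kmeans_labels(z, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per row of z."""
    rng = np.random.default_rng(seed)
    # Initialise centers from randomly chosen data points (a copy of those rows).
    centers = z[rng.choice(len(z), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            members = z[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels


def knn_predict(train_z, train_y, test_z, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    d2 = ((test_z[:, None, :] - train_z[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]  # shape: (n_test, k)
    n_classes = int(train_y.max()) + 1
    return np.array([np.bincount(v, minlength=n_classes).argmax() for v in votes])


def cluster_learnability(z, n_clusters=10, k=5, seed=0):
    """Hypothetical CL score: held-out KNN accuracy on K-means pseudo-labels."""
    labels = kmeans_labels(z, n_clusters, seed=seed)
    perm = np.random.default_rng(seed).permutation(len(z))
    half = len(z) // 2
    train, test = perm[:half], perm[half:]
    pred = knn_predict(z[train], labels[train], z[test], k)
    return float((pred == labels[test]).mean())


# Toy "representations": three well-separated 2-D blobs (synthetic data).
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
z = np.concatenate([c + 0.1 * rng.standard_normal((100, 2)) for c in blob_centers])
score = cluster_learnability(z, n_clusters=3, k=5)
print(f"cluster learnability (toy data): {score:.3f}")
```

On representations with clear cluster structure, such a score should sit near 1; representations whose K-means partition is hard for a KNN classifier to reproduce would score lower. No labels are needed at any step, which is the property the abstract emphasises.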

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies · Machine Learning and Data Classification

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings