Label-free Monitoring of Self-Supervised Learning Progress
Isaac Xu, Scott Lowe, and Thomas Trappenberg

TL;DR
This paper explores label-free metrics like clustering quality and entropy to monitor self-supervised learning progress, aiming to replace reliance on annotated data for evaluating encoder quality.
Contribution
It introduces and evaluates unlabelled data metrics such as silhouette score, clustering agreement, and entropy, comparing their effectiveness to traditional linear probe accuracy across different SSL methods.
Findings
Clustering metrics correlate with linear probe accuracy for SimCLR and MoCo-v2.
Entropy stabilizes and correlates better at later training stages.
Entropy may be architecture-independent, unlike clustering metrics.
Abstract
Self-supervised learning (SSL) is an effective method for exploiting unlabelled data to learn a high-level embedding space that can be used for various downstream tasks. However, existing methods to monitor the quality of the encoder -- either during training for one model or to compare several trained models -- still rely on access to annotated data. When SSL methodologies are applied to new data domains, a sufficiently large labelled dataset may not always be available. In this study, we propose several evaluation metrics which can be applied on the embeddings of unlabelled data and investigate their viability by comparing them to linear probe accuracy (a common metric which utilizes an annotated dataset). In particular, we apply k-means clustering and measure the clustering quality with the silhouette score and clustering agreement. We also measure the entropy of the embedding…
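The clustering-based metrics named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and makes several assumptions that do not come from the paper: the function names (`kmeans`, `silhouette_score`, `adjusted_rand_index`) are illustrative, toy 2-D points stand in for encoder embeddings, k = 2 is arbitrary, and "clustering agreement" is approximated here by the adjusted Rand index between two k-means runs with different seeds. This excerpt does not say which agreement measure or which pair of clusterings the authors use.

```python
import math
import random
from collections import Counter


def kmeans(points, k, seed=0, n_iter=50):
    """Plain Lloyd's algorithm (fixed iteration count); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to each cluster, excluding p itself.
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]  # cohesion: own cluster
        if a is None:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def _pair_count(counts):
    """Number of unordered sample pairs that share a group."""
    return sum(math.comb(v, 2) for v in counts.values())


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same samples."""
    total_pairs = math.comb(len(labels_a), 2)
    same_both = _pair_count(Counter(zip(labels_a, labels_b)))
    same_a = _pair_count(Counter(labels_a))
    same_b = _pair_count(Counter(labels_b))
    expected = same_a * same_b / total_pairs
    max_index = (same_a + same_b) / 2
    if max_index == expected:  # degenerate case, e.g. both labelings trivial
        return 1.0
    return (same_both - expected) / (max_index - expected)


# Toy "embeddings": two Gaussian blobs in 2-D (an assumption for illustration).
rng = random.Random(0)
embeddings = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
embeddings += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(20)]

run_a = kmeans(embeddings, k=2, seed=0)
run_b = kmeans(embeddings, k=2, seed=1)
print("silhouette:", silhouette_score(embeddings, run_a))
print("agreement (ARI):", adjusted_rand_index(run_a, run_b))
```

Neither quantity requires labels: the silhouette score lies in [-1, 1] and approaches 1 when clusters are compact and well separated, while the agreement score is invariant to how cluster indices are permuted between runs. In practice one would likely reach for `sklearn.cluster.KMeans`, `sklearn.metrics.silhouette_score`, and `sklearn.metrics.adjusted_rand_score` rather than these hand-rolled versions.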
Taxonomy
Methods: Kaiming Initialization · Max Pooling · Convolution · Average Pooling · Global Average Pooling · Dense Connections · Color Jitter · Random Gaussian Blur
