Beyond Accuracy: Statistical Measures and Benchmark for Evaluation of Representation from Self-Supervised Learning
Jiantao Wu, Shentong Mo, Sara Atito, Josef Kittler, Zhenhua Feng, Muhammad Awais

TL;DR
This paper introduces a large-scale, diverse benchmark and new statistical metrics for evaluating self-supervised learning representations, revealing limitations of current models and offering a more nuanced assessment beyond accuracy.
Contribution
It presents the Statistical Metric Learning Benchmark (SMLB) based on ImageNet-21K and WordNet, along with novel metrics 'overlap' and 'aSTD' for comprehensive evaluation.
Findings
SSL models exhibit class bias and limited discriminative ability.
The new metrics provide robust, efficient evaluation of representation quality.
Benchmark reveals gaps in supervised learning and areas for improvement.
Abstract
Recently, self-supervised metric learning has attracted attention for its potential to learn a generic distance function. It overcomes limitations of conventional supervised approaches, e.g., scalability and label biases. Despite progress in this domain, current benchmarks cover only a narrow scope of classes, which hinders nuanced evaluation of semantic representations. To bridge this gap, we introduce a large-scale benchmark with diverse and fine-grained classes, the Statistical Metric Learning Benchmark (SMLB), built upon ImageNet-21K and WordNet. SMLB is designed to rigorously evaluate discriminative discernment and generalizability across more than 14M images, 20K classes, and 16K taxonomic nodes. Alongside, we propose novel evaluation metrics, 'overlap' for separability and 'aSTD' for consistency, to measure distance statistical information, which are efficient and robust to the…
Taxonomy
Topics: AI in cancer detection · Digital Imaging for Blood Diseases · Domain Adaptation and Few-Shot Learning
