Beyond Accuracy: Statistical Measures and Benchmark for Evaluation of Representation from Self-Supervised Learning

Jiantao Wu; Shentong Mo; Sara Atito; Josef Kittler; Zhenhua Feng; Muhammad Awais

arXiv:2312.01118·cs.CV·December 5, 2023·1 cites

Open Access

TL;DR

This paper introduces a large-scale, diverse benchmark and new statistical metrics for evaluating self-supervised learning representations, revealing limitations of current models and offering a more nuanced assessment beyond accuracy.

Contribution

It presents the Statistical Metric Learning Benchmark (SMLB) based on ImageNet-21K and WordNet, along with novel metrics 'overlap' and 'aSTD' for comprehensive evaluation.

Findings

01. SSL models exhibit class bias and limited discriminative ability.

02. The new metrics provide robust, efficient evaluation of representation quality.

03. Benchmark reveals gaps in supervised learning and areas for improvement.

Abstract

Recently, self-supervised metric learning has attracted attention for its potential to learn a generic distance function. It overcomes limitations of its conventional supervised counterpart, e.g., scalability and label biases. Despite progress in this domain, current benchmarks cover only a narrow scope of classes, which prevents nuanced evaluation of semantic representations. To bridge this gap, we introduce a large-scale benchmark with diverse and granular classes, the Statistical Metric Learning Benchmark (SMLB), built upon ImageNet-21K and WordNet. SMLB is designed to rigorously evaluate discriminative discernment and generalizability across more than 14M images, 20K classes, and 16K taxonomic nodes. Alongside, we propose novel evaluation metrics, 'overlap' for separability and 'aSTD' for consistency, to measure distance statistical information, which are efficient and robust to the…

Taxonomy

Topics: AI in cancer detection · Digital Imaging for Blood Diseases · Domain Adaptation and Few-Shot Learning