Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing
Iro Laina, Yuki M. Asano, Andrea Vedaldi

TL;DR
This paper introduces reverse linear probing, a method that measures the interpretability of self-supervised visual representations by estimating the mutual information between a representation and a set of labeled concepts. The resulting score reflects how much semantic content the representation encodes and can detect when it encodes combinations of concepts.
Contribution
It proposes a new interpretability metric based on reverse linear probing, capable of detecting semantic and compositional information in representations, and introduces automatic concept labeling for large datasets.
Findings
The method effectively ranks representations by interpretability.
It detects when representations encode concept combinations.
It reveals differences from standard linear probe evaluations.
Abstract
Self-supervised visual representation learning has recently attracted significant research interest. While a common way to evaluate self-supervised representations is through transfer to various downstream tasks, we instead investigate the problem of measuring their interpretability, i.e. understanding the semantics encoded in raw representations. We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts. To quantify this we introduce a decoding bottleneck: information must be captured by simple predictors, mapping concepts to clusters in representation space. This approach, which we call reverse linear probing, provides a single number sensitive to the semanticity of the representation. This measure is also able to detect when the representation contains combinations of concepts (e.g., "red apple") instead of just…
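The core quantity in the abstract, the mutual information between a quantized representation and labeled concepts, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the representation has already been quantized into cluster ids (for example by k-means), and it replaces the learned concept-to-cluster predictor with a plain empirical (plug-in) estimate over discrete labels. The function names `entropy` and `mutual_information` and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of an empirical distribution given raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def mutual_information(clusters, concepts):
    """Plug-in estimate of I(Z; Y) in bits.

    clusters: cluster id per sample (the quantized representation Z).
    concepts: concept label per sample (Y); any hashable value, so a tuple
              such as ("red", "apple") can stand for a concept combination.
    Assumption: a simple counting estimator, not the paper's learned predictor.
    """
    if len(clusters) != len(concepts):
        raise ValueError("clusters and concepts must have the same length")
    n = len(clusters)
    if n == 0:
        raise ValueError("at least one sample is required")
    h_z = entropy(Counter(clusters).values(), n)
    h_y = entropy(Counter(concepts).values(), n)
    h_zy = entropy(Counter(zip(clusters, concepts)).values(), n)
    # I(Z; Y) = H(Z) - H(Z | Y), with H(Z | Y) = H(Z, Y) - H(Y)
    return h_z - (h_zy - h_y)


# Toy data: four clusters, each holding one (color, object) combination.
clusters = [0, 1, 2, 3]
colors = ["red", "green", "red", "green"]
objects = ["apple", "apple", "car", "car"]
combos = list(zip(colors, objects))

mi_color = mutual_information(clusters, colors)    # 1.0 bit
mi_object = mutual_information(clusters, objects)  # 1.0 bit
mi_combo = mutual_information(clusters, combos)    # 2.0 bits
```

In this toy case each single concept explains only one of the two bits of cluster entropy, while the combination explains both. That is the kind of signal the abstract describes for representations that encode combinations such as "red apple" rather than individual attributes.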
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification
