Capability $\neq$ Interpretability: Human Interpretability of Vision Foundation Models

Julien Colin; Lore Goetschalckx; Nuria Oliver; Thomas Serre

arXiv:2605.20337·cs.CV·May 21, 2026

TL;DR

This paper introduces a framework for measuring the human interpretability of vision models. It finds that foundation models are less interpretable than supervised ones, and that interpretability is linked to feature locality and semantic alignment.

Contribution

The authors develop a psychophysics-based framework for quantifying the interpretability of vision models and apply it to six vision transformers (two supervised ViTs and four foundation models) under two protocols, localizability and nameability.

Findings

01. Foundation models are less interpretable than supervised models.

02. Interpretability correlates with feature locality and semantic alignment.

03. Interpretability does not impact downstream task performance.

Abstract

How interpretable are the features of leading vision models? The question is increasingly pressing as these models move from research benchmarks into high-stakes deployments, yet existing methods cannot answer it reliably. We close this gap with a framework for measuring and comparing the human interpretability of vision models, built around two complementary psychophysics protocols: (1) localizability -- can an observer predict where a feature fires on a novel image? -- and (2) nameability -- can an observer accurately describe what the feature represents? Features are recovered via sparse autoencoders, and a chance-anchored scoring function places every model on a common scale. Applying the framework to six vision transformers -- two supervised ViTs and four foundation models (DINOv2, DINOv3, CLIP, SigLIP) -- we collected more than $15{,}000$ behavioral responses, analyzing the…
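
The abstract does not give the form of the chance-anchored scoring function. As a rough illustration of the general idea only, the sketch below uses a common chance-correction, rescaling raw observer accuracy so that chance performance maps to 0 and perfect performance maps to 1. The function name `chance_anchored_score`, the linear rescaling, and the chance levels in the usage lines are all assumptions made for this example, not the authors' method.

```python
def chance_anchored_score(accuracy: float, chance: float) -> float:
    """Rescale raw accuracy so chance maps to 0.0 and perfect accuracy to 1.0.

    Hypothetical illustration: a standard linear chance-correction,
    not the scoring function defined in the paper.
    """
    if not 0.0 <= chance < 1.0:
        raise ValueError("chance must lie in [0, 1)")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    # Below-chance accuracy yields a negative score.
    return (accuracy - chance) / (1.0 - chance)


# Assumed example: two tasks with different chance levels become comparable.
two_choice = chance_anchored_score(accuracy=0.75, chance=0.5)     # 2 alternatives
four_choice = chance_anchored_score(accuracy=0.625, chance=0.25)  # 4 alternatives
print(two_choice, four_choice)
```

Under this assumed rescaling, 75% correct on a two-alternative task and 62.5% correct on a four-alternative task land on the same point of the common scale, which is the kind of cross-protocol comparison a chance-anchored score is meant to allow.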
