TL;DR
This paper investigates the internal concepts learned by DINOv2, showing how different downstream tasks draw on these concepts and proposing a geometric interpretation in which tokens are sums of convex mixtures of archetypes (Minkowski sums).
Contribution
It introduces the Minkowski Representation Hypothesis and provides a detailed analysis of the learned concepts' geometry and their functional roles.
Findings
Downstream tasks recruit distinct concept groups, such as negations for classification and boundary detectors for segmentation.
Representations are partly dense and organized beyond simple sparsity, resembling convex mixtures of archetypes.
Tokens form low-dimensional, locally connected sets that can be interpreted through Minkowski geometry.
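To make the "convex mixtures of archetypes" picture concrete, here is a minimal sketch of how a token could be built as a sum of convex mixtures, one per archetype group. It is an illustration only, not the paper's code: the group sizes, the dimension, the random archetypes, and the helper names `convex_weights`, `convex_mixture`, and `minkowski_token` are all assumptions made for this example.

```python
import random

def convex_weights(k, rng):
    """Random non-negative weights that sum to 1 (a point on the simplex)."""
    raw = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]

def convex_mixture(archetypes, weights):
    """Weighted average of archetype vectors; lies in their convex hull."""
    dim = len(archetypes[0])
    return [sum(w * a[i] for w, a in zip(weights, archetypes)) for i in range(dim)]

def minkowski_token(groups, rng):
    """Sum one convex mixture per group: a point in the Minkowski sum of the hulls."""
    mixtures = [convex_mixture(g, convex_weights(len(g), rng)) for g in groups]
    dim = len(mixtures[0])
    return [sum(m[i] for m in mixtures) for i in range(dim)]

# Hypothetical setup: three archetype groups (say category, color, texture)
# with 3, 2, and 4 random archetypes each, in a 4-dimensional space.
rng = random.Random(0)
dim = 4
groups = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(k)] for k in (3, 2, 4)]
token = minkowski_token(groups, rng)
```

Under this reading, a token is not a sparse sum of independent directions but a point in a bounded region whose corners are sums of archetypes, which is one way to picture the "partly dense" structure noted above.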
Abstract
DINOv2 is routinely deployed to recognize objects, scenes, and actions, yet the nature of what it perceives remains unknown. As a working baseline, we adopt the Linear Representation Hypothesis (LRH) and operationalize it using sparse autoencoders (SAEs), producing a 32,000-unit dictionary that serves as the interpretability backbone of our study, which unfolds in three parts. In the first part, we analyze how different downstream tasks recruit concepts from our learned dictionary, revealing functional specialization: classification exploits "Elsewhere" concepts that fire everywhere except on target objects, implementing learned negations; segmentation relies on boundary detectors forming coherent subspaces; depth estimation draws on three distinct monocular depth cues matching visual neuroscience principles. Following these functional results, we analyze the geometry and statistics of the concepts…
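As a rough illustration of what operationalizing the LRH with a sparse autoencoder means, the sketch below encodes an activation into a few non-negative unit activations and reconstructs it as a sparse sum of dictionary atoms. Everything here is an assumption made for illustration: the top-k sparsity rule, the tied encoder and decoder weights, the 4-unit toy dictionary, and the function names `encode_topk` and `decode`. The abstract does not say which SAE variant the authors use.

```python
def encode_topk(x, enc, k):
    """ReLU pre-activations, keeping only the k largest (the rest are set to 0)."""
    pre = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in enc]
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    return [pre[i] if i in keep else 0.0 for i in range(len(pre))]

def decode(codes, dictionary):
    """Reconstruct the input as a sparse sum of dictionary atoms (one atom per unit)."""
    dim = len(dictionary[0])
    return [sum(c * atom[i] for c, atom in zip(codes, dictionary)) for i in range(dim)]

# Toy setup (hypothetical): 4 units in a 2-D space, tied encoder/decoder weights.
dictionary = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
x = [2.0, -0.5]
codes = encode_topk(x, dictionary, k=2)   # only two units stay active
recon = decode(codes, dictionary)         # sparse linear reconstruction of x
```

In this toy case the two active units are enough to reconstruct `x` exactly. A real dictionary would be learned from model activations and would reconstruct them only approximately.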