Characterizing Universal Object Representations Across Vision Models

Florian P. Mahner, Johannes Roth, Ka Chun Lam, Michael F. Bonner, Francisco Pereira, and Martin N. Hebart

arXiv:2605.13675·cs.CV·May 14, 2026

TL;DR

This study decomposes the object representations of 162 diverse vision models into non-negative dimensions and identifies universal, interpretable dimensions that are linked to biological vision and are not explained by differences in architecture, training data, or model size.

Contribution

The paper introduces a method for identifying universal object representation dimensions across models and links these dimensions to biological vision and semantic properties.

Findings

01. Universal dimensions are more interpretable and driven by conceptual image properties.

02. Differences in architecture, training data, or size do not explain universality.

03. Models with more universal dimensions better predict biological visual responses.

Abstract

Deep neural networks trained with different architectures, objectives, and datasets have been reported to converge on similar visual representations. However, what remains unknown is which visual properties models actually converge on and which factors may underlie this convergence. To address this, we decompose the object similarity structure of 162 diverse vision models into a small set of non-negative dimensions. To determine universal versus model-specific dimensions, we then estimate how often each dimension reappears across models. In contrast to model-specific dimensions, universal dimensions are more interpretable and more strongly driven by conceptual image properties, indicating the relevance of interpretability and semantic content as implicit factors driving universality across models. Differences in architecture, objective function, training data, model size, and model…
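The step of estimating how often a dimension reappears across models can be illustrated with a small sketch. This is not the authors' procedure: it assumes each model has already been decomposed into a non-negative objects-by-dimensions matrix, and it counts a dimension as "reappearing" in another model when its best-matching dimension there has a Pearson correlation of at least 0.8. The function name `reappearance_rate`, the correlation criterion, the threshold, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def reappearance_rate(embeddings, threshold=0.8):
    """For each dimension of each model, return the fraction of the OTHER
    models that contain at least one dimension whose Pearson correlation
    with it is >= threshold. The matching rule is an assumption of this
    sketch, not the criterion used in the paper."""
    rates = []
    for i, ref in enumerate(embeddings):
        # Standardize columns so that (X^T Y) / n_objects equals Pearson r.
        ref_z = (ref - ref.mean(axis=0)) / ref.std(axis=0)
        hits = np.zeros(ref.shape[1])
        for j, other in enumerate(embeddings):
            if i == j:
                continue
            other_z = (other - other.mean(axis=0)) / other.std(axis=0)
            corr = ref_z.T @ other_z / ref.shape[0]  # (dims_ref, dims_other)
            # A dimension "reappears" if its best match clears the threshold.
            hits += corr.max(axis=1) >= threshold
        rates.append(hits / (len(embeddings) - 1))
    return rates


# Toy data (hypothetical): three "models", each with one dimension shared
# across all models (plus small noise) and one model-specific dimension.
rng = np.random.default_rng(0)
n_objects = 200
shared = rng.random(n_objects)
models = [
    np.column_stack([shared + 0.01 * rng.random(n_objects),
                     rng.random(n_objects)])
    for _ in range(3)
]

rates = reappearance_rate(models)
for k, r in enumerate(rates):
    print(f"model {k}: reappearance rate per dimension = {r}")
```

In this toy setup the shared dimension is recovered in every other model while the model-specific dimension is not, which is the distinction between universal and model-specific dimensions that the abstract describes. With real embeddings, the threshold and the matching rule would need to be chosen and validated rather than taken from this sketch.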
