Evaluating Representational Similarity Measures from the Lens of Functional Correspondence
Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla

TL;DR
This paper evaluates representational similarity measures used in neuroscience and AI and finds that geometry-focused metrics such as CKA and Procrustes align best with behavioral data, which can guide metric selection.
Contribution
It systematically compares eight similarity metrics in the visual domain, highlighting which best capture behavioral relevance and model distinctions.
Findings
CKA and Procrustes outperform others in behavioral alignment
Geometry-based metrics better differentiate trained vs. untrained models
Linear predictivity shows moderate behavioral correlation
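For readers unfamiliar with the two geometry-based measures named in the findings, the following is a minimal NumPy sketch of their common linear formulations. It is not the authors' implementation: the function names, the centering step, and the unit-Frobenius-norm scaling are assumptions made for illustration, and the paper's exact variants are not specified in this summary. Both functions expect two response matrices with one row per stimulus (the same stimuli in the same order) and one column per unit.

```python
import numpy as np


def _center(X):
    """Subtract the per-feature mean so each column has zero mean."""
    return X - X.mean(axis=0, keepdims=True)


def linear_cka(X, Y):
    """Linear CKA between two (n_stimuli, n_features) matrices.

    Assumed formulation: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    on column-centered data. Values lie in [0, 1]; 1 means the two
    representations match up to rotation and isotropic scaling.
    """
    X, Y = _center(X), _center(Y)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)


def procrustes_distance(X, Y):
    """Orthogonal Procrustes distance between two representations.

    Assumed formulation: center each matrix, scale it to unit
    Frobenius norm, then minimize ||X - Y Q||_F over orthogonal Q.
    The minimum has the closed form sqrt(2 - 2 * ||X^T Y||_*),
    where ||.||_* is the nuclear norm.
    """
    X, Y = _center(X), _center(Y)
    X = X / np.linalg.norm(X, "fro")
    Y = Y / np.linalg.norm(Y, "fro")
    nuclear = np.linalg.norm(X.T @ Y, "nuc")
    # Clip at zero to guard against tiny negative values from round-off.
    return np.sqrt(max(0.0, 2.0 - 2.0 * nuclear))
```

Both measures are invariant to orthogonal transformations of the feature axes, so a representation compared with a rotated copy of itself should give a CKA near 1 and a Procrustes distance near 0. That invariance is what "geometry-based" refers to here.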
Abstract
Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics…
Peer Reviews
Decision · Submitted to ICLR 2025
The paper is mostly straightforward and clear. Although an empirical study, it tackles a useful question. Figures 4 and 5 provide a clear and simple message.
The captions for each figure can be improved by including more information about what properties are being averaged over and what are being correlated (e.g. the dots in Figure 3 represent datasets). The use of the word “behavior” is awkward to this reader, particularly when (unless I’m mistaken) the only thing you are considering is classification. This also makes the results sound much more general than they likely are. It is certainly reasonable that the success of CKA and Procrustes in
Deeper understanding of the representational similarity measures is an important topic. Comparisons to the actual classification behaviour of the models is an interesting new viewpoint. The basic results fit with earlier analyses and appear to be solid.
While I generally believe the results and they are somewhat interesting, I think the analyses could be much more thorough and broad. The formula for RSA comparisons is wrong. To represent the classic formulation of RSA requires a standardisation of the X values and an important vectorisation step implied in $\tau$. And there are many more modern and preferred variations of this technique today. Similarly the linear encoding model description given here says nothing about the important steps of
There are several representational similarity metrics in the literature and there is relatively little understanding of their functional relationship. This work addresses this problem by comparing several metrics on a variety of behavioral tasks and models. The main strength of this paper is that it evaluates several similarity metrics (a total of 8 or 12 depending on if each k-NN is counted separately) and behavioral metrics (a total of 9). Thus, this work offers benchmarks for the representat
In many ways, this paper seems incomplete. Much of the paper reads like a methods paper even though no new methods are introduced. Arguably, much of sections 1.1 and 1.2 could be relegated to the appendices. Apart from measuring the correlation between similarity metrics and behavioral metrics, there is little interpretation or investigation in the results section. In this way, this paper has the feel of exploratory analysis without follow up scientific hypothesis and analyses. Overall, I think
Taxonomy
Topics: Neural Networks and Applications · Cognitive Science and Education Research
Methods: Procrustes
