How Aligned are Different Alignment Metrics?
Jannis Ahlert, Thomas Klein, Felix Wichmann, Robert Geirhos

TL;DR
This paper investigates the consistency among various neural and behavioral alignment metrics, revealing low correlations and emphasizing the need for integrative benchmarking to accurately assess model alignment with human perception.
Contribution
It provides a comprehensive analysis of multiple alignment metrics, highlights their low correlations, and proposes methods for fair aggregation in benchmarking neural network alignment.
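Why the choice of aggregation matters can be illustrated with a small sketch. The two aggregation rules below (mean of raw scores and mean of per-metric ranks) and the toy score matrix are assumptions chosen for illustration; they are not taken from the paper and are not necessarily the methods it proposes.

```python
import numpy as np

def aggregate_mean(scores):
    """Average the raw scores of each model across metrics.

    scores: array of shape (n_models, n_metrics).
    """
    return scores.mean(axis=1)

def aggregate_mean_rank(scores):
    """Average each model's within-metric rank (0 = worst).

    Ranks come from a double argsort, which assumes no ties within a metric.
    """
    ranks = scores.argsort(axis=0).argsort(axis=0)
    return ranks.mean(axis=1)

# Hypothetical scores: rows are models, columns are metrics.
# Model 0 wins metric 0 by a wide margin but is narrowly last on the others.
scores = np.array([
    [0.90, 0.10, 0.11],
    [0.50, 0.12, 0.12],
    [0.40, 0.13, 0.13],
])

best_by_mean = int(np.argmax(aggregate_mean(scores)))
best_by_rank = int(np.argmax(aggregate_mean_rank(scores)))
print(best_by_mean, best_by_rank)
```

In this toy case the raw mean is dominated by the metric with the widest score range, while the rank-based mean weights every metric equally, so the two rules select different "best" models.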
Findings
Pairwise correlations between neural and behavioral alignment metrics are low and sometimes negative.
Aggregation methods significantly influence overall alignment scores.
Alignment metrics may measure different aspects of neural and behavioral similarity.
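The first finding can be made concrete with a minimal sketch of how an average pairwise correlation between metrics might be computed from a models-by-metrics score table. The function name, the toy data, and the use of Pearson correlation are assumptions for illustration; the summary does not state which correlation coefficient the paper uses.

```python
import numpy as np

def mean_pairwise_correlation(scores):
    """Mean Pearson correlation over all distinct pairs of metrics.

    scores: array of shape (n_models, n_metrics); each column holds one
    metric's scores for every model.
    """
    corr = np.corrcoef(scores, rowvar=False)        # (n_metrics, n_metrics)
    upper = np.triu_indices_from(corr, k=1)         # each pair once, no diagonal
    return float(corr[upper].mean())

# Hypothetical scores for 4 models on 3 metrics (not real benchmark data).
toy_scores = np.array([
    [0.1, 0.9, 0.2],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.4],
    [0.4, 0.2, 0.3],
])
print(mean_pairwise_correlation(toy_scores))
```

A value near 1 would mean the metrics rank models almost identically; values near 0 or below indicate that the metrics disagree about which models are well aligned.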
Abstract
In recent years, various methods and benchmarks have been proposed to empirically evaluate the alignment of artificial neural networks to human neural and behavioral data. But how aligned are different alignment metrics? To answer this question, we analyze visual data from Brain-Score (Schrimpf et al., 2018), including metrics from the model-vs-human toolbox (Geirhos et al., 2021), together with human feature alignment (Linsley et al., 2018; Fel et al., 2022) and human similarity judgements (Muttenthaler et al., 2022). We find that pairwise correlations between neural scores and behavioral scores are quite low and sometimes even negative. For instance, the average correlation between those 80 models on Brain-Score that were fully evaluated on all 69 alignment metrics we considered is only 0.198. Assuming that all of the employed metrics are sound, this implies that alignment with human…
