Towards a Learning Theory of Representation Alignment
Francesco Insulla, Shuo Huang, Lorenzo Rosasco

TL;DR
This paper offers a learning-theoretic framework for understanding representation alignment in AI models, connecting various notions of alignment and analyzing the role of stitching and kernel alignment.
Contribution
It introduces a novel learning-theoretic perspective on representation alignment, linking stitching to kernel alignment and advancing theoretical understanding.
Findings
Relates stitching properties to kernel alignment of representations
Connects metric, probabilistic, and spectral notions of alignment
Presents a first step towards formalizing representation alignment as a learning problem
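As a rough illustration of the kernel alignment mentioned in the findings above, the sketch below computes the standard normalized alignment between two Gram matrices (centered, as in CKA) with NumPy. It is a minimal sketch rather than the paper's own definition, which this summary does not reproduce. The function names, the linear kernel, and the synthetic features are all assumptions made for the example.

```python
import numpy as np

def linear_kernel(features):
    """Gram matrix of a representation under a linear kernel (assumed choice)."""
    return features @ features.T

def center(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_alignment(K1, K2, centered=True):
    """Normalized Frobenius inner product <K1, K2>_F / (||K1||_F ||K2||_F)."""
    if centered:
        K1, K2 = center(K1), center(K2)
    return float(np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2)))

# Synthetic stand-ins for two models' representations of the same 50 inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))   # random orthogonal matrix
Z_rotated = X @ Q                              # same geometry, rotated basis
Z_independent = rng.normal(size=(50, 8))       # unrelated features

a_rot = kernel_alignment(linear_kernel(X), linear_kernel(Z_rotated))
a_ind = kernel_alignment(linear_kernel(X), linear_kernel(Z_independent))
print(f"rotated: {a_rot:.3f}, independent: {a_ind:.3f}")
```

A rotation leaves the linear Gram matrix unchanged, so the rotated representation is fully aligned with the original, while independently drawn features score lower.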
Abstract
It has recently been argued that AI models' representations are becoming aligned as their scale and performance increase. Empirical analyses have been designed to support this idea and conjecture the possible alignment of different representations toward a shared statistical model of reality. In this paper, we propose a learning-theoretic perspective to representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is relating properties of stitching to the kernel alignment of the underlying representation. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem.
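To make the idea of stitching more concrete, here is a toy sketch with linear representations, a linear stitcher, and a linear task head. Every name, dimension, and dataset in it is hypothetical, and it is not the paper's construction. The stitcher is fitted by least-squares regression of one representation onto the other, which in this all-linear setting also minimizes the squared task loss of the stitched predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_rep = 400, 10, 6          # hypothetical sizes

X = rng.normal(size=(n, d_in))
W2 = rng.normal(size=(d_in, d_rep))  # "model 2" representation map (assumed linear)
head2 = rng.normal(size=d_rep)       # "model 2" task head (assumed linear)
Z2 = X @ W2
y = Z2 @ head2                       # task targets realized by model 2

# Two candidate "model 1" representations of the same inputs.
A = rng.normal(size=(d_rep, d_rep))
Z1_aligned = Z2 @ A                              # linear re-parameterization of Z2
Z1_other = X @ rng.normal(size=(d_in, d_rep))    # different 6-dim projection of X

def stitching_error(Z1, Z2, head, y, n_train=200):
    """Fit a linear stitcher S on the first n_train points, then report
    held-out squared error of the stitched predictor z1 -> head . (S^T z1)."""
    S, *_ = np.linalg.lstsq(Z1[:n_train], Z2[:n_train], rcond=None)
    pred = Z1[n_train:] @ S @ head
    return float(np.mean((pred - y[n_train:]) ** 2))

err_aligned = stitching_error(Z1_aligned, Z2, head2, y)
err_other = stitching_error(Z1_other, Z2, head2, y)
print(f"aligned: {err_aligned:.2e}, other: {err_other:.2e}")
```

In this toy setting, the representation that is a linear re-parameterization of the target can be stitched almost perfectly, while the unrelated projection leaves a residual task error. This is the kind of gap that a link between stitching and alignment is meant to quantify.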
Peer Reviews
Decision·ICLR 2025 Poster
The manuscript is clear and well-written. Although perhaps a little dense, the first section does a good job of introducing various notions of representation alignment and elucidating their relationships. I found this precise and succinct presentation to be useful and digestible. The formalisation of stitching methods is also clear and straightforward. While the assumption of linear stitchers is quite restrictive, it does lead to some nice proofs.
I don’t think the manuscript has any significant weaknesses. I do think that at the end of section 3 it might be helpful to give a few-sentence summary of the relationships between the various measures of alignment. While the preceding material is complete, such a summary would help readers who have not digested all of that material on a first pass to comprehend the manuscript more easily. As mentioned above, the restriction to linear stitching functions is potentially quite limiting. It might be
While I am not familiar with all the references, the first section of the paper, where the authors give an overview of different formulations of representation alignment and provide connected interpretations from different communities, seems to be a nice contribution. The theoretical formulation of the stitching method, which is used frequently in practice, is a relevant point, and the results can give valuable insights into the use of stitching in practice. In general the paper is well-written and well structu
- I appreciate the first section of the paper, but it would be nice to have a final paragraph in which the authors summarize and give a brief overview of everything in one place. - While it is a solid theoretical work, I think it is great that the settings in question are still relevant for practice. Having a couple of sentences summarizing the practical implications of the results in the conclusion might be helpful for a broader audience. - It is a little confusing about the significance of 'task defined
- The authors do a good job of introducing the different concepts of kernel alignment and independence testing needed for their theory. - Using kernel alignment to provide a generalisation error bound for stitching seems novel and useful given the rise of representation learning. - I think the paper is generally well written, making it accessible to readers who are not that familiar with a learning-theoretic perspective on this problem.
- It is unclear to me from Section 4 how the stitching error and stitching error bound can be used. Since the paper makes reference to existing empirical work, I would like to see a more concrete example or a reference to where the derived theory fits into existing empirical work. - For instance, how does the proposed theory quantify the stitching error in current approaches (e.g., can different layers be compared equivalently, do normalisation and regularisation play a role?). - I feel there is a s
Taxonomy
Topics: Language and cultural evolution
MethodsFocus
