Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

TL;DR
This paper investigates layer-wise representation similarity in transformer models, showing that simple cosine similarity aligns with complex metrics, and proposes an aligned training method to enhance early layer effectiveness and interpretability.
Contribution
It introduces a simple cosine similarity measure for analyzing transformer representations, provides theoretical insights, and proposes an aligned training approach to improve shallow layer performance.
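The summary does not spell out the aligned training objective. The sketch below assumes one plausible form: a single classifier shared by every layer, trained on a weighted sum of per-layer cross-entropy losses. The names `shared_classifier`, `aligned_loss`, and `layer_weights`, and the choice of weights that grow with depth, are illustrative assumptions rather than the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true label."""
    return -math.log(softmax(logits)[label])

def shared_classifier(W, h):
    """Logits W @ h, using the SAME weight matrix W at every layer (assumption)."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def aligned_loss(W, hidden_states, label, layer_weights):
    """Weighted sum of per-layer cross-entropy under one shared classifier."""
    return sum(
        lam * cross_entropy(shared_classifier(W, h), label)
        for lam, h in zip(layer_weights, hidden_states)
    )

# Toy example: three "layers" of 2-d features for one sample whose label is 0.
W = [[1.0, 0.0], [0.0, 1.0]]
hidden_states = [[0.2, 0.1], [1.0, 0.0], [3.0, 0.0]]
layer_weights = [1 / 6, 2 / 6, 3 / 6]  # hypothetical: deeper layers weigh more

loss = aligned_loss(W, hidden_states, label=0, layer_weights=layer_weights)
last_layer_only = cross_entropy(shared_classifier(W, hidden_states[-1]), 0)
```

Compared with training on `last_layer_only`, this objective also penalizes shallow layers whose features the shared classifier cannot yet separate, which is the intuition behind improving shallow-layer performance.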
Findings
Cosine similarity aligns with CKA in capturing layer-wise similarity.
Representation similarity increases as layers get closer, correlating with model confidence.
Aligned training enhances early layer accuracy and enables effective multi-exit models.
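The multi-exit finding can be illustrated with a minimal confidence-based early-exit loop. This is a sketch under stated assumptions: one shared classifier `W` is applied to each layer's feature, and inference stops at the first layer whose top softmax probability reaches a threshold. The function name `early_exit_predict` and the threshold values are hypothetical and not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def shared_classifier(W, h):
    """Logits W @ h, with the same W reused at every exit (assumption)."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def early_exit_predict(W, hidden_states, threshold):
    """Return (predicted_class, exit_layer) for one sample.

    Exits at the first layer whose top probability is >= threshold,
    falling back to the last layer if no layer is confident enough.
    """
    last = len(hidden_states) - 1
    for layer, h in enumerate(hidden_states):
        probs = softmax(shared_classifier(W, h))
        confidence = max(probs)
        if confidence >= threshold or layer == last:
            return probs.index(confidence), layer

# Toy example: features become more separable with depth.
W = [[1.0, 0.0], [0.0, 1.0]]
hidden_states = [[0.2, 0.1], [1.0, 0.0], [3.0, 0.0]]

moderate = early_exit_predict(W, hidden_states, threshold=0.7)
strict = early_exit_predict(W, hidden_states, threshold=0.99)
```

In this toy example the moderate threshold exits at the middle layer, while the strict one runs to the final layer. In a real model the per-layer features would be computed lazily, so that an early exit actually skips the remaining layers.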
Abstract
Analyzing the similarity of internal representations has been an important technique for understanding the behavior of deep neural networks. Most existing methods for analyzing the similarity between representations of high dimensions, such as those based on Centered Kernel Alignment (CKA), rely on statistical properties of the representations for a set of data points. In this paper, we focus on transformer models and study the similarity of representations between the hidden layers of individual transformers. In this context, we show that a simple sample-wise cosine similarity metric is capable of capturing the similarity and aligns with the complicated CKA. Our experimental results on common transformers reveal that representations across layers are positively correlated, with similarity increasing when layers get closer. We provide a theoretical justification for this phenomenon…
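The contrast drawn in the abstract can be sketched as follows. The sample-wise cosine similarity needs only one input's per-layer vectors, whereas linear CKA needs a batch of representations. The sketch assumes each layer yields one vector per sample (for example a class-token feature) and uses the standard linear CKA formula. The paper's exact definitions may differ, and all function names here are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def layerwise_similarity(hidden_states):
    """Pairwise cosine similarity between the layer vectors of ONE sample."""
    return [[cosine_similarity(hi, hj) for hj in hidden_states]
            for hi in hidden_states]

def center_columns(X):
    """Subtract each feature's mean over the batch (rows are samples)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]

def cross_frobenius_sq(X, Y):
    """Squared Frobenius norm of X^T Y, computed column pair by column pair."""
    return sum(
        sum(a * b for a, b in zip(xc, yc)) ** 2
        for xc in zip(*X)
        for yc in zip(*Y)
    )

def linear_cka(X, Y):
    """Linear CKA between two batches of representations (same sample order)."""
    Xc, Yc = center_columns(X), center_columns(Y)
    denom = math.sqrt(cross_frobenius_sq(Xc, Xc) * cross_frobenius_sq(Yc, Yc))
    return cross_frobenius_sq(Xc, Yc) / denom

# One sample, three layers: adjacent layers are closer in angle than distant ones.
hidden_states = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
sim = layerwise_similarity(hidden_states)

# A batch of four samples at two layers; the second layer is a rescaled copy.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
Y = [[2.0 * x for x in row] for row in X]
cka_xy = linear_cka(X, Y)
```

Here `sim[0][1]` exceeds `sim[0][2]`, mirroring the reported pattern of similarity increasing as layers get closer. `cka_xy` is 1 up to floating-point error, because linear CKA is invariant to isotropic rescaling.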
Peer Reviews
Decision: ICLR 2025 Poster
- The manuscript is logically organized, including the observations and its applications.
- Both empirical and theoretical justifications are provided.
- The study covers both the vision and language domains.
- Line 342: “To the best of our knowledge, our work is the first to show that one common classifier is sufficient for multi-exit models.” This is not true; many early-exiting methods work with a single classifier head [1-3].
- Line 240: “Progressively increasing layer-wise representation similarity”. Observations might be different in other domains [2]. Is there any insight into why autoregressive models seem not to have progressively increasing layer-wise representation similarity?
- Missing
In general, the proposed paper is well written. The authors provide detailed experiments and extensive theoretical analysis to analyze the feature patterns in Transformers. Based on these analyses, the authors propose an aligned training method for enhancing shallow-layer performance. The proposed method achieves performance gains and speed boosts on CV and NLP tasks.
In general, I think the proposed paper is well formulated and written. However, I still have some concerns about the paper: 1. I don't think it is useful to enhance similarity between all Transformer outputs (representations). It may help in simple tasks like image classification and sentence classification, but for complex tasks (like object detection or semantic segmentation) we may not want all representations to be similar. I think similar tasks may exist in NLP (like parsing). Then
1. By analyzing the similarity of representations among different layers, the paper demonstrates the possibility of early saturation events and of a shared classifier among different layers.
2. They further present a training strategy to improve the effectiveness of shallow layers, such that they can enjoy more early saturation events, minimal depth, and so on.
3. Some analysis in this paper is insightful, e.g., the shallow layers are able to approach the performance of the deep layers. This might
1. A small weakness is that previous approaches have already observed that representations in the early layers can also yield reasonable classifiers, though I think this is a tiny issue. Please refer to the Questions.
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Machine Learning and Algorithms
Methods: Sparse Evolutionary Training · Early exiting using confidence measures · Focus
