The Generalization Ridge: Information Flow in Natural Language Generation
Ruidi Chang, Chunyuan Deng, Hanjie Chen

TL;DR
This paper introduces InfoRidge, an information-theoretic framework for analyzing transformer-based language models. It shows that predictive information peaks in intermediate layers, forming a generalization ridge, and then declines in the final layers, which highlights the importance of intermediate layers for generalization.
Contribution
The study uncovers a non-monotonic pattern of information flow in transformers and characterizes how generalization capability emerges and propagates across layers during training.
Findings
Predictive information peaks in intermediate layers, forming a generalization ridge.
The ridge phenomenon persists across multiple decoding steps.
Intermediate layers play a critical role in model generalization.
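As a rough illustration of the ridge pattern listed above, the sketch below treats predictive information as the mutual information I(Z; Y) between a layer's representation Z and the target Y, following the definition given in the abstract. It is a toy under loud assumptions, not the paper's estimator: the representations are assumed to be already discretized into two codes, the three joint tables are hand-made rather than measured from any model, and the helper names `mutual_information` and `ridge_layer` are hypothetical.

```python
import math


def mutual_information(joint):
    """Plug-in estimate of I(Z; Y) in bits from a joint table p(z, y).

    `joint[i][j]` is the probability of code z=i co-occurring with target y=j.
    """
    pz = [sum(row) for row in joint]          # marginal over targets
    py = [sum(col) for col in zip(*joint)]    # marginal over codes
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero cells contribute nothing to the sum
                mi += p * math.log2(p / (pz[i] * py[j]))
    return mi


def ridge_layer(mi_per_layer):
    """Index of the layer whose predictive information is largest."""
    return max(range(len(mi_per_layer)), key=mi_per_layer.__getitem__)


# Hypothetical joint tables for three layers (illustrative values only).
layers = [
    [[0.25, 0.25], [0.25, 0.25]],  # early: code independent of target
    [[0.50, 0.00], [0.00, 0.50]],  # middle: code determines target
    [[0.40, 0.10], [0.10, 0.40]],  # final: code only partly informative
]

mi_per_layer = [mutual_information(joint) for joint in layers]
peak = ridge_layer(mi_per_layer)
print(mi_per_layer, "peak layer:", peak)
```

With these made-up tables the middle layer carries the most information about the target, which mirrors the shape of the reported ridge. Estimating mutual information for real, high-dimensional hidden states needs a different estimator than this discrete plug-in form.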
Abstract
Transformer-based language models have achieved state-of-the-art performance in natural language generation (NLG), yet their internal mechanisms for synthesizing task-relevant information remain insufficiently understood. While prior studies suggest that intermediate layers often yield more generalizable representations than final layers, how this generalization ability emerges and propagates across layers during training remains unclear. We propose InfoRidge, an information-theoretic framework, to characterize how predictive information (the mutual information between hidden representations and target outputs) varies across depth during training. Our experiments across various models and datasets reveal a consistent non-monotonic trend: predictive information peaks in intermediate layers, forming a generalization ridge, before declining in final layers, reflecting a transition between…
