Excess Description Length of Learning Generalizable Predictors
Elizabeth Donoway, Hailey Joren, Fabien Roger, Jan Leike

TL;DR
This paper introduces an information-theoretic measure called Excess Description Length (EDL) to quantify how much predictive structure fine-tuning extracts from data, providing insights into learning, generalization, and capability elicitation in language models.
Contribution
It develops a formal framework using EDL to analyze the information learned during fine-tuning and clarifies common misconceptions about structure, learning, and generalization in models.
Findings
EDL is non-negative in expectation and converges to surplus description length in the infinite-data limit.
Random labels produce near-zero EDL, indicating no learned structure.
Format learning causes early transient effects distinct from capability acquisition.
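The first two findings can be illustrated with a minimal sketch, which is not the paper's experimental setup. It assumes a hypothetical one-parameter Bernoulli model (a single logit `w`, no inputs) trained online by SGD, and it evaluates the residual cost on the same training labels under the final model; the summary does not say whether the paper uses training or held-out labels for that term. The function names, the learning rate, and the label distributions are all illustrative choices.

```python
import math
import random


def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))


def bits(p_one, y):
    """Code length in bits of label y under a Bernoulli model with P(y=1) = p_one."""
    p = p_one if y == 1 else 1.0 - p_one
    return -math.log2(p)


def excess_description_length(labels, lr=0.02):
    """Return (edl, prequential_bits, residual_bits) for a toy online learner.

    Assumption: the model is a single logit w, started at w = 0 (P(y=1) = 0.5),
    and updated by one SGD step on the log-loss after each label is encoded.
    """
    w = 0.0
    prequential = 0.0
    for y in labels:
        p = sigmoid(w)
        prequential += bits(p, y)  # encode y with the model trained on earlier labels only
        w += lr * (y - p)          # then update the model on y
    p_final = sigmoid(w)
    # Assumption: residual cost is measured on the same labels under the final model.
    residual = sum(bits(p_final, y) for y in labels)
    return prequential - residual, prequential, residual


def sample_labels(n, p_one, seed):
    """Draw n i.i.d. Bernoulli(p_one) labels with a fixed seed."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


if __name__ == "__main__":
    structured = sample_labels(5000, 0.9, seed=0)  # labels the initial model predicts poorly
    coin_flips = sample_labels(5000, 0.5, seed=1)  # labels the initial model already matches
    for name, labels in [("structured", structured), ("coin flips", coin_flips)]:
        edl, preq, res = excess_description_length(labels)
        print(f"{name}: prequential={preq:.1f} bits, residual={res:.1f} bits, EDL={edl:.1f} bits")
```

In this toy setting the biased labels give a clearly positive gap, because the early predictions are much worse than the final ones, while the fair-coin labels give a gap that is small relative to the total code length, because there is nothing for the learner to pick up beyond its starting point.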
Abstract
Understanding whether fine-tuning elicits latent capabilities or teaches new ones is a fundamental question for language model evaluation and safety. We develop a formal information-theoretic framework for quantifying how much predictive structure fine-tuning extracts from the training dataset and writes into a model's parameters. Our central quantity, Excess Description Length (EDL), is defined via prequential coding and measures the gap between the bits required to encode training labels sequentially using an evolving model (trained online) and the residual encoding cost under the final trained model. We establish that EDL is non-negative in expectation, converges to surplus description length in the infinite-data limit, and provides bounds on expected generalization gain. Through a series of toy models, we clarify common confusions about information in learning: why random labels yield…
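One way to write the verbal definition above in symbols is sketched below. The notation is an assumption, not taken from the paper: $(x_i, y_i)$ are the $n$ training examples in presentation order, $\theta_{i-1}$ denotes the parameters after online training on the first $i-1$ examples, and $\theta_n$ denotes the final trained model. The second sum is written over the same labels as the first; the abstract does not specify whether the paper evaluates the residual cost on training or held-out labels.

```latex
\[
\mathrm{EDL}
  \;=\;
  \underbrace{\sum_{i=1}^{n} -\log_2 p_{\theta_{i-1}}(y_i \mid x_i)}_{\text{prequential code length}}
  \;-\;
  \underbrace{\sum_{i=1}^{n} -\log_2 p_{\theta_n}(y_i \mid x_i)}_{\text{residual cost under the final model}}
\]
```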
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms
