Hierarchical Latent Structures in Data Generation Process Unify Mechanistic Phenomena across Scale
Jonas Rohweder, Subhabrata Dutta, Iryna Gurevych

TL;DR
This paper introduces a hierarchical data generation framework based on probabilistic context-free grammars (PCFGs) and uses it to explain how mechanistic phenomena such as induction heads, function vectors, and the Hydra effect emerge in language models, connecting synthetic-data analysis with observations from real-world model checkpoints.
Contribution
The paper presents a synthetic data generation approach that captures hierarchical structure and offers a unified explanation for multiple mechanistic phenomena in language models.
Findings
Hierarchical data structures explain phenomena such as induction heads and the Hydra effect.
Synthetic corpora replicate phenomena observed in real language models.
Hierarchy in data generation influences training dynamics of language models.
Abstract
Contemporary studies have uncovered many puzzling phenomena in the neural information processing of Transformer-based language models. Building a robust, unified understanding of these phenomena requires disassembling a model within the scope of its training. While the intractable scale of pretraining corpora limits a bottom-up investigation in this direction, simplistic assumptions of the data generation process limit the expressivity and fail to explain complex patterns. In this work, we use probabilistic context-free grammars (PCFGs) to generate synthetic corpora that are faithful and computationally efficient proxies for web-scale text corpora. We investigate the emergence of three mechanistic phenomena: induction heads, function vectors, and the Hydra effect, under our designed data generation process, as well as in the checkpoints of real-world language models. Our findings…
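To make the idea of sampling a synthetic corpus from a PCFG concrete, here is a minimal sketch. The grammar, its probabilities, and the names `TOY_PCFG`, `sample`, and `generate_corpus` are all hypothetical and chosen only for illustration; they are not the grammar or code used in the paper.

```python
import random

# Hypothetical toy grammar (an assumption, not the paper's grammar).
# Each nonterminal maps to a list of (probability, right-hand side) rules.
TOY_PCFG = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.6, ["Det", "N"]), (0.4, ["Det", "Adj", "N"])],
    "VP": [(0.5, ["V", "NP"]), (0.5, ["V"])],
    "Det": [(1.0, ["the"])],
    "Adj": [(0.5, ["small"]), (0.5, ["large"])],
    "N": [(0.5, ["model"]), (0.5, ["token"])],
    "V": [(0.5, ["predicts"]), (0.5, ["copies"])],
}


def sample(symbol, grammar, rng):
    """Expand `symbol` top-down and return the resulting list of terminals."""
    if symbol not in grammar:
        # Anything without a production rule is treated as a terminal token.
        return [symbol]
    rules = grammar[symbol]
    weights = [prob for prob, _ in rules]
    # Pick one production according to its probability.
    _, rhs = rng.choices(rules, weights=weights, k=1)[0]
    tokens = []
    for child in rhs:
        tokens.extend(sample(child, grammar, rng))
    return tokens


def generate_corpus(n_sentences, seed=0):
    """Sample `n_sentences` independent sentences from the toy grammar."""
    rng = random.Random(seed)
    return [" ".join(sample("S", TOY_PCFG, rng)) for _ in range(n_sentences)]


if __name__ == "__main__":
    for sentence in generate_corpus(5):
        print(sentence)
```

Each sentence is produced by recursively expanding nonterminals, so the latent parse tree is the hierarchical structure that generates the surface tokens. This toy grammar is non-recursive, which guarantees termination; a recursive grammar would need a depth limit or suitably small recursion probabilities.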
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Natural Language Processing Techniques
