Universal Response and Emergence of Induction in LLMs
Niclas Luick

TL;DR
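This paper investigates how induction behavior emerges in large language models by probing their response to weak single-token perturbations of the residual stream, revealing a universal regime in which the response is scale-invariant under changes in perturbation strength and in which induction signatures build up across layers and models.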
Contribution
It introduces a novel method to probe induction signatures in LLMs, demonstrating their emergence across different models and layers, and provides a benchmark for circuit analysis.
Findings
Induction signatures emerge gradually within intermediate layers.
The response remains scale-invariant under changes in perturbation strength.
Induction signatures are observed in Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL.
Abstract
While induction is considered a key mechanism for in-context learning in LLMs, understanding its precise circuit decomposition beyond toy models remains elusive. Here, we study the emergence of induction behavior within LLMs by probing their response to weak single-token perturbations of the residual stream. We find that LLMs exhibit a robust, universal regime in which their response remains scale-invariant under changes in perturbation strength, thereby allowing us to quantify the build-up of token correlations throughout the model. By applying our method, we observe signatures of induction behavior within the residual stream of Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL. Across all models, we find that these induction signatures gradually emerge within intermediate layers and identify the relevant model sections composing this behavior. Our results provide insights into the collective…
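To make the perturbation idea in the abstract concrete, here is a minimal sketch on a toy residual network, not on any of the models named above and not the author's implementation. It assumes that "scale-invariant" can be read as: the response divided by the perturbation strength stays the same when the strength changes. The names `toy_forward`, `normalized_response`, `eps`, and `direction`, and the toy architecture itself (causal averaging plus a tanh update), are illustrative assumptions.

```python
import numpy as np

def toy_forward(x, weights):
    """Toy residual stack (an assumption, not a real LLM).

    Each layer averages tokens causally, applies a linear map and tanh,
    and adds the result back to the residual stream.
    Returns the residual stream before the first layer and after every layer.
    """
    n_tokens = x.shape[0]
    # Causal averaging matrix: token t mixes uniformly over tokens 0..t.
    mix = np.tril(np.ones((n_tokens, n_tokens)))
    mix /= mix.sum(axis=1, keepdims=True)
    states = [x]
    for w in weights:
        x = x + np.tanh(mix @ x @ w)
        states.append(x)
    return states

def normalized_response(x, weights, position, direction, eps):
    """Perturb one token by eps * direction and return |delta| / eps.

    The result has shape (n_layers + 1, n_tokens): one response norm
    per layer and per token position.
    """
    base = toy_forward(x, weights)
    x_pert = x.copy()
    x_pert[position] += eps * direction
    pert = toy_forward(x_pert, weights)
    return np.array(
        [np.linalg.norm(p - b, axis=1) / eps for p, b in zip(pert, base)]
    )

rng = np.random.default_rng(0)
n_tokens, d_model, n_layers = 8, 16, 4
x = rng.normal(size=(n_tokens, d_model))
weights = [
    rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
    for _ in range(n_layers)
]
direction = rng.normal(size=d_model)
direction /= np.linalg.norm(direction)  # unit-norm perturbation direction

# Same perturbation at two different strengths.
r_small = normalized_response(x, weights, position=2, direction=direction, eps=1e-4)
r_smaller = normalized_response(x, weights, position=2, direction=direction, eps=1e-5)

# In the weak-perturbation regime the two normalized responses should agree.
print(np.max(np.abs(r_small - r_smaller)))
```

In this toy setting, each row of `r_small` shows how strongly every token position responds at a given depth to a perturbation injected at position 2. Earlier positions stay at zero because the mixing is causal, while later positions pick up a response layer by layer, which is the kind of token-to-token correlation build-up the abstract describes.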
Taxonomy
Topics: Magnetic confinement fusion research
