Universal Response and Emergence of Induction in LLMs

Niclas Luick

arXiv:2411.07071·cs.LG·November 12, 2024

TL;DR

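This paper studies how induction behavior emerges in large language models by probing their response to weak single-token perturbations of the residual stream. It finds a universal regime in which the response is scale-invariant under changes in perturbation strength, and induction signatures that build up gradually across intermediate layers in several models.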

Contribution

The paper introduces a perturbation-based method for probing induction signatures in LLMs, shows that these signatures emerge across layers in several models, and provides a benchmark for circuit analysis.

Findings

01 Induction signatures emerge gradually within intermediate layers.

02 The response is scale-invariant under changes in perturbation strength.

03 Signatures are observed in Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL.

Abstract

While induction is considered a key mechanism for in-context learning in LLMs, understanding its precise circuit decomposition beyond toy models remains elusive. Here, we study the emergence of induction behavior within LLMs by probing their response to weak single-token perturbations of the residual stream. We find that LLMs exhibit a robust, universal regime in which their response remains scale-invariant under changes in perturbation strength, thereby allowing us to quantify the build-up of token correlations throughout the model. By applying our method, we observe signatures of induction behavior within the residual stream of Gemma-2-2B, Llama-3.2-3B, and GPT-2-XL. Across all models, we find that these induction signatures gradually emerge within intermediate layers and identify the relevant model sections composing this behavior. Our results provide insights into the collective…
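To make the idea of a scale-invariant response concrete, the following is a minimal sketch on a toy model, not the paper's implementation. Everything in it is an assumption made for illustration: the toy residual network with a fixed causal mixing matrix standing in for attention, the names `make_toy_model`, `forward`, and `normalized_response`, and the choice of the per-token response norm divided by the perturbation strength as the measured quantity. It perturbs the residual stream of one token before the first layer and checks that the normalized response at every layer is nearly the same for two different perturbation strengths.

```python
import numpy as np


def make_toy_model(n_layers=4, d_model=16, seq_len=8, seed=0):
    """Build a hypothetical toy residual model (not a real LLM)."""
    rng = np.random.default_rng(seed)
    # Causal, row-normalized mixing matrix: a fixed stand-in for attention.
    mix = np.tril(np.ones((seq_len, seq_len)))
    mix /= mix.sum(axis=1, keepdims=True)
    # One random weight matrix per layer (small scale keeps tanh unsaturated).
    weights = [rng.normal(scale=0.1, size=(d_model, d_model))
               for _ in range(n_layers)]
    # Random initial residual stream, one row per token.
    x0 = rng.normal(size=(seq_len, d_model))
    return mix, weights, x0


def forward(x, mix, weights):
    """Return the residual stream before the first layer and after each layer."""
    states = [x]
    for w in weights:
        x = x + np.tanh(mix @ x @ w)  # residual update
        states.append(x)
    return states


def normalized_response(eps, source_pos, direction, mix, weights, x0):
    """Per-layer, per-token norm of the change in the residual stream,
    divided by the perturbation strength eps."""
    base = forward(x0, mix, weights)
    x_pert = x0.copy()
    x_pert[source_pos] += eps * direction  # weak single-token perturbation
    pert = forward(x_pert, mix, weights)
    return np.array([np.linalg.norm(p - b, axis=1) / eps
                     for p, b in zip(pert, base)])


mix, weights, x0 = make_toy_model()
direction = np.zeros(x0.shape[1])
direction[0] = 1.0  # unit-norm perturbation direction (arbitrary choice)

# Same perturbation at two strengths, one order of magnitude apart.
r_small = normalized_response(1e-5, 2, direction, mix, weights, x0)
r_large = normalized_response(1e-4, 2, direction, mix, weights, x0)

# In the weak-perturbation regime the two normalized responses nearly coincide.
print(np.max(np.abs(r_small - r_large)))
```

In this toy setting the normalized response array has one row per layer (including the input) and one column per token. Tokens before the perturbed position show no response because the mixing is causal, while later tokens pick up a response that grows layer by layer. How the paper defines and normalizes its response measure on real models may differ from this sketch.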
