In-context Learning and Induction Heads

Catherine Olsson; Nelson Elhage; Neel Nanda; Nicholas Joseph; Nova DasSarma; Tom Henighan; Ben Mann; Amanda Askell; Yuntao Bai; Anna Chen; Tom Conerly; Dawn Drain; Deep Ganguli; Zac Hatfield-Dodds; Danny Hernandez; Scott Johnston; Andy Jones; Jackson Kernion; Liane Lovitt; Kamal Ndousse; Dario Amodei; Tom Brown; Jack Clark; Jared Kaplan; Sam McCandlish; Chris Olah

arXiv:2209.11895·cs.LG·September 27, 2022·84 cites

Open Access

TL;DR

This paper hypothesizes that induction heads are the mechanism behind most in-context learning in large transformer models, and supports the hypothesis with evidence from models of varying sizes at different points in training.

Contribution

The paper links induction heads to in-context learning through six complementary lines of evidence: strong, causal evidence for small attention-only models and correlational evidence for larger models with MLPs.

Findings

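01. Induction heads develop at the same point in training as a sudden, sharp increase in in-context learning ability, visible as a bump in the training loss.

02. Induction heads may be the mechanistic source of general in-context learning in transformer models of any size.

03. The evidence is causal for small attention-only models and correlational for larger models with MLPs.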

Abstract

"Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices). We find that induction heads develop at precisely the same point as a sudden sharp increase in in-context learning ability, visible as a bump in the training loss. We present six complementary lines of evidence, arguing that induction heads may be the mechanistic source of general in-context learning in transformer models of any size. For small attention-only models, we present strong, causal evidence; for larger models with MLPs, we present correlational evidence.
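The two definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code and does not model attention: `induction_complete` only mimics the input-output behaviour of the [A][B] ... [A] -> [B] rule on a list of tokens, and `in_context_gain` expresses "decreasing loss at increasing token indices" as a difference between two per-token losses. Both function names, and the choice of which two positions to compare, are assumptions made for illustration.

```python
from typing import Optional, Sequence


def induction_complete(tokens: Sequence[str]) -> Optional[str]:
    """Predict the next token with the [A][B] ... [A] -> [B] rule.

    Finds the most recent earlier occurrence of the current (last) token
    and returns the token that followed it. Returns None if the current
    token has not appeared earlier in the context.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the current one.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None


def in_context_gain(per_token_loss: Sequence[float], early: int, late: int) -> float:
    """Loss at a late position minus loss at an early position.

    A negative value means the loss fell as more context became available,
    which is the sense of "in-context learning" used in the abstract.
    The positions `early` and `late` are hypothetical parameters here.
    """
    return per_token_loss[late] - per_token_loss[early]


if __name__ == "__main__":
    print(induction_complete(["A", "B", "C", "A"]))   # B
    print(in_context_gain([3.0, 2.5, 2.0, 1.0], 0, 3))  # -2.0
```

The lookup is exact-match only, which is the simplest form of the pattern; it is meant to make the rule concrete, not to describe how a trained model implements it.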

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications · Machine Learning and ELM