Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence

Gouki Minegishi; Hiroki Furuta; Shohei Taniguchi; Yusuke Iwasawa; Yutaka Matsuo

arXiv:2505.16694·cs.CL·June 11, 2025

TL;DR

This paper investigates how transformer-based language models acquire in-context meta-learning ability during training. It extends the analysis of in-context learning beyond induction heads and reports that the underlying circuits emerge in multiple phases.

Contribution

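The study extends the copy task from previous research into an In-Context Meta Learning setting, in which the model must infer a task from in-context examples to answer a query. Analyzing the model's circuits over the course of training in this setting reveals multiple phases of circuit emergence.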

Findings

01

Multiple training phases with distinct circuit emergence.

02

Circuit development correlates with meta-learning ability.

03

Insights into the dynamics of transformer training processes.

Abstract

Transformer-based language models exhibit In-Context Learning (ICL), where predictions are made adaptively based on context. While prior work links induction heads to ICL through a sudden jump in accuracy, this can only account for ICL when the answer is included within the context. However, an important property of practical ICL in large language models is the ability to meta-learn how to solve tasks from context, rather than just copying answers from context; how such an ability is obtained during training is largely unexplored. In this paper, we experimentally clarify how such meta-learning ability is acquired by analyzing the dynamics of the model's circuit during training. Specifically, we extend the copy task from previous research into an In-Context Meta Learning setting, where models must infer a task from examples to answer queries. Interestingly, in this setting, we find that…
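To make the contrast between copying an answer and inferring a task concrete, the following is a minimal sketch of one episode in such a setting. It is an illustrative toy, not the paper's construction: the task family (`task_label`, a cyclic label shift), the function names, and all sizes are assumptions introduced here. The query item is deliberately excluded from the context, so the target cannot be copied and has to be derived from the task that the context pairs imply.

```python
import random


def task_label(item, task_id, num_labels):
    # Hypothetical task family (an assumption, not from the paper):
    # each task shifts the label of an item by its task id.
    return (item + task_id) % num_labels


def make_episode(rng, num_items=16, num_labels=4, num_tasks=4, context_len=3):
    # Sample one hidden task, then distinct items for the context and the query.
    task_id = rng.randrange(num_tasks)
    items = rng.sample(range(num_items), context_len + 1)
    context_items, query = items[:-1], items[-1]
    # Context pairs are (item, label) under the hidden task.
    context = [(x, task_label(x, task_id, num_labels)) for x in context_items]
    target = task_label(query, task_id, num_labels)
    return {"task_id": task_id, "context": context, "query": query, "target": target}


def infer_tasks(context, num_tasks, num_labels):
    # Return every task id consistent with all context pairs.
    return [
        t
        for t in range(num_tasks)
        if all(task_label(x, t, num_labels) == y for x, y in context)
    ]


episode = make_episode(random.Random(0))
candidates = infer_tasks(episode["context"], num_tasks=4, num_labels=4)
prediction = task_label(episode["query"], candidates[0], 4)
```

Because the number of tasks does not exceed the number of labels in this toy, the context pairs pin down exactly one task, and applying that task to the unseen query reproduces the target. A plain copy task would instead place the query item among the context items, so its label could be read off directly.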

Videos

Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence · SlidesLive

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Domain Adaptation and Few-Shot Learning