Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning

Nhi Hoai Doan; Tatsuya Hiraoka; Kentaro Inui

arXiv:2507.07810 · cs.CL · November 12, 2025

Open Access

TL;DR

This paper explores how repetition neurons and induction heads in large language models influence in-context learning, revealing layer-dependent effects and strategies to reduce repetition without impairing learning performance.

Contribution

Where prior work has focused mainly on attention heads, this paper studies skill neurons, specifically repetition neurons, and compares their effects with those of induction heads to better understand and control ICL behavior.

Findings

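01 The impact of repetition neurons on ICL performance varies with the depth of the layer in which they reside.

02 Repetitive outputs can be reduced while maintaining strong ICL performance.

03 These strategies are identified by comparing the effects of repetition neurons with those of induction heads.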

Abstract

This paper investigates the relationship between large language models' (LLMs) ability to recognize repetitive input patterns and their performance on in-context learning (ICL). In contrast to prior work that has primarily focused on attention heads, we examine this relationship from the perspective of skill neurons, specifically repetition neurons. Our experiments reveal that the impact of these neurons on ICL performance varies depending on the depth of the layer in which they reside. By comparing the effects of repetition neurons and induction heads, we further identify strategies for reducing repetitive outputs while maintaining strong ICL capabilities.

Taxonomy

Topics: Action Observation and Synchronization · Motor Control and Adaptation · Domain Adaptation and Few-Shot Learning