Repetition Neurons: How Do Language Models Produce Repetitions?

Tatsuya Hiraoka; Kentaro Inui

arXiv:2410.13497·cs.CL·February 21, 2025

TL;DR

This paper investigates repetition neurons in language models: neurons, regarded as skill neurons, that activate progressively more strongly as generated text keeps repeating, which points to a mechanism behind the repetition problem.

Contribution

The paper introduces the concept of repetition neurons, identifies them in three English and one Japanese pre-trained language models, and compares their activation patterns across these models.

Findings

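01 Repetition neurons activate progressively more strongly as repetition continues during text generation.

02 Similar patterns of repetition neurons are observed across the three English and one Japanese pre-trained language models analyzed.

03 The activation pattern indicates that repetition neurons treat repetition as a task of copying the previous context, similar to in-context learning.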

Abstract

This paper introduces repetition neurons, regarded as skill neurons responsible for the repetition problem in text generation tasks. These neurons are progressively activated more strongly as repetition continues, indicating that they perceive repetition as a task to copy the previous context repeatedly, similar to in-context learning. We identify these repetition neurons by comparing activation values before and after the onset of repetition in texts generated by recent pre-trained language models. We analyze the repetition neurons in three English and one Japanese pre-trained language models and observe similar patterns across them.
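
The identification step described in the abstract, comparing activation values before and after the onset of repetition, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `rank_repetition_neurons`, the mean-difference score, the `(num_tokens, num_neurons)` array layout, and the toy data are all assumptions made for the example. The abstract does not specify the exact scoring rule.

```python
import numpy as np


def rank_repetition_neurons(activations, onset, top_k=5):
    """Rank neurons by the rise in mean activation after repetition starts.

    activations: array of shape (num_tokens, num_neurons), one row per
        generated token (assumed layout, for illustration only).
    onset: index of the first token of the repeated span (assumed to be
        known in advance; detecting it is outside this sketch).
    Returns the indices of the top_k neurons and their score (after - before).
    """
    activations = np.asarray(activations, dtype=float)
    if not 0 < onset < activations.shape[0]:
        raise ValueError("onset must leave at least one token on each side")

    before = activations[:onset].mean(axis=0)  # mean activation pre-repetition
    after = activations[onset:].mean(axis=0)   # mean activation during repetition
    diff = after - before

    order = np.argsort(-diff)[:top_k]          # largest increase first
    return order, diff[order]


# Toy data: 40 tokens, 8 neurons, low-level noise everywhere.
rng = np.random.default_rng(0)
acts = rng.normal(0.0, 0.1, size=(40, 8))
# Hypothetical "repetition neuron": neuron 3 ramps up once repetition begins.
acts[20:, 3] += np.linspace(0.5, 2.0, 20)

top, scores = rank_repetition_neurons(acts, onset=20, top_k=3)
print("candidate repetition neurons:", top.tolist())
```

In a real setting the rows of `activations` would come from a model's hidden units recorded over a generated text, and the ramp added to neuron 3 here merely imitates the progressively stronger activation the abstract reports.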

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling