Repetition Neurons: How Do Language Models Produce Repetitions?
Tatsuya Hiraoka, Kentaro Inui

TL;DR
This paper identifies repetition neurons in language models: skill neurons that activate progressively more strongly as generated text keeps repeating, which points to a mechanism behind the repetition problem.
Contribution
It introduces the concept of repetition neurons, identifies them in three English and one Japanese pre-trained language models, and analyzes their activation patterns.
Findings
Repetition neurons are activated progressively more strongly as repetition continues.
Similar patterns of repetition neurons appear across three English and one Japanese pre-trained language models.
The activation pattern suggests these neurons treat repetition as a task of copying the previous context, similar to in-context learning.
Abstract
This paper introduces repetition neurons, regarded as skill neurons responsible for the repetition problem in text generation tasks. These neurons are progressively activated more strongly as repetition continues, indicating that they perceive repetition as a task to copy the previous context repeatedly, similar to in-context learning. We identify these repetition neurons by comparing activation values before and after the onset of repetition in texts generated by recent pre-trained language models. We analyze the repetition neurons in three English and one Japanese pre-trained language models and observe similar patterns across them.
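The identification step described in the abstract, comparing activation values before and after the onset of repetition, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rank_repetition_neurons`, the mean-difference score, the `top_k` cutoff, and the toy activation values are all assumptions made for illustration, and the onset position is assumed to be known in advance.

```python
from statistics import mean


def rank_repetition_neurons(activations, onset, top_k=2):
    """Rank neurons by how much their mean activation rises after repetition starts.

    activations: list of per-token rows, each row holding one value per neuron.
    onset: index of the first token that belongs to the repeated span (assumed known).
    The mean-difference score is an illustrative assumption, not the paper's exact metric.
    """
    if not 0 < onset < len(activations):
        raise ValueError("onset must split the sequence into two non-empty parts")

    n_neurons = len(activations[0])
    scores = []
    for j in range(n_neurons):
        before = mean(row[j] for row in activations[:onset])
        after = mean(row[j] for row in activations[onset:])
        scores.append((after - before, j))

    # Largest increase first: these are the candidate repetition neurons.
    scores.sort(key=lambda s: s[0], reverse=True)
    return [(j, diff) for diff, j in scores[:top_k]]


# Toy data (hypothetical): 4 tokens x 3 neurons, repetition starting at token 2.
toy = [
    [0.1, 0.5, 0.2],
    [0.2, 0.4, 0.1],
    [0.9, 0.5, 0.2],
    [1.1, 0.4, 0.3],
]
ranked = rank_repetition_neurons(toy, onset=2, top_k=1)
print(ranked)
```

In this toy input only neuron 0 shows a clear jump after the onset, so it is ranked first. With a real model, the rows would come from recorded hidden activations per generated token, and scores would typically be aggregated over many generated texts.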
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
