Interpretable Next-token Prediction via the Generalized Induction Head

Eunji Kim; Sriya Mantena; Weiwei Yang; Chandan Singh; Sungroh Yoon; Jianfeng Gao

arXiv:2411.00066 · cs.CL · October 31, 2025



TL;DR

The paper introduces the Generalized Induction-Head Model (GIM), an interpretable next-token prediction model inspired by induction heads, which improves performance and interpretability in language modeling and neural response prediction.

Contribution

GIM is a retrieval-based, interpretable model that combines exact n-gram matching with fuzzy matching based on a neural similarity metric, narrowing the performance gap between interpretable models and black-box LLMs.

Findings

01

GIM improves next-token prediction by up to 25 percentage points (25%p) over interpretable baselines.

02

GIM enhances neural response prediction by 20%.

03

GIM offers insights into language selectivity of the brain.

Abstract

While large transformer models excel in predictive performance, their lack of interpretability restricts their usefulness in high-stakes domains. To remedy this, we propose the Generalized Induction-Head Model (GIM), an interpretable model for next-token prediction inspired by the observation of "induction heads" in LLMs. GIM is a retrieval-based module that identifies similar sequences in the input context by combining exact n-gram matching and fuzzy matching based on a neural similarity metric. We evaluate GIM in two settings: language modeling and fMRI response prediction. In language modeling, GIM improves next-token prediction by up to 25%p over interpretable baselines, significantly narrowing the gap with black-box LLMs. In an fMRI setting, GIM improves neural response prediction by 20% and offers insights into the language selectivity of the brain. GIM represents a significant…
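The retrieval idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `min_exact_n` threshold, the rule for falling back from exact to fuzzy matching, and the position-overlap `toy_similarity` (a stand-in for the neural similarity metric) are all assumptions made for illustration.

```python
from collections import Counter


def exact_match_next(context, min_n=1):
    """Find the longest suffix of `context` that also occurs earlier in
    `context`, and count the tokens that followed those earlier occurrences."""
    for n in range(len(context) - 1, min_n - 1, -1):
        suffix = context[-n:]
        counts = Counter()
        # Only start positions that leave room for a following token.
        for i in range(len(context) - n):
            if context[i:i + n] == suffix:
                counts[context[i + n]] += 1
        if counts:
            return n, counts
    return 0, Counter()


def toy_similarity(a, b):
    """Hypothetical placeholder for a neural similarity metric:
    the fraction of positions at which two equal-length windows agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def fuzzy_match_next(context, window=3, similarity=toy_similarity):
    """Weight each earlier window's following token by its similarity
    to the most recent `window` tokens, then normalize."""
    if len(context) <= window:
        return {}
    query = context[-window:]
    scores = {}
    for i in range(len(context) - window):
        sim = similarity(context[i:i + window], query)
        tok = context[i + window]
        scores[tok] = scores.get(tok, 0.0) + sim
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()} if total > 0 else {}


def predict_next(context, min_exact_n=2, window=3):
    """Assumed combination rule: trust exact matching when the matched
    suffix is long enough, otherwise fall back to fuzzy matching."""
    n, counts = exact_match_next(context)
    if n >= min_exact_n:
        total = sum(counts.values())
        return "exact", {t: c / total for t, c in counts.items()}
    return "fuzzy", fuzzy_match_next(context, window)
```

For example, `predict_next("the cat sat on the mat . the cat".split())` uses the exact branch, because the two-token suffix "the cat" appears earlier in the context. Every prediction traces back to specific positions in the input, which is the sense in which a retrieval-based predictor of this kind is interpretable.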


Code & Models

Repositories

ejkim47/induction-gram (PyTorch, official)

Videos

Interpretable Next-token Prediction via the Generalized Induction Head · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings