On Support Samples of Next Word Prediction

Yuqian Li; Yupei Du; Yufang Liu; Feifei Feng; Mou Xiao Feng; Yuanbin Wu

arXiv:2506.04047·cs.CL·June 10, 2025

TL;DR

This paper explores the role of support samples in next-word prediction, revealing their intrinsic properties and significance in model generalization, with implications for interpretability and training strategies.

Contribution

It introduces a data-centric interpretability framework using the representer theorem to identify and analyze support samples in language models.

Findings

01

Support samples are intrinsically identifiable before training.

02

Non-support samples influence generalization and prevent overfitting.

03

The importance of non-support samples increases in deeper model layers.

Abstract

Language models excel in various tasks by making complex decisions, yet understanding the rationale behind these decisions remains a challenge. This paper investigates data-centric interpretability in language models, focusing on the next-word prediction task. Using the representer theorem, we identify two types of support samples: those that promote specific predictions and those that deter them. Our findings reveal that being a support sample is an intrinsic property, predictable even before training begins. Additionally, while non-support samples are less influential in direct predictions, they play a critical role in preventing overfitting and shaping generalization and representation learning. Notably, the importance of non-support samples increases in deeper layers, suggesting their significant role in intermediate representation formation. These insights shed light on the…
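
The abstract does not spell out how the representer theorem yields promoting and deterring samples, so the following is only a minimal sketch of the classical representer decomposition, not the paper's implementation. It assumes a toy setting: fixed feature vectors feed an L2-regularized linear softmax layer trained to a stationary point, so each logit of a test input can be written as a sum of per-training-sample contributions. All names (`fit_l2_softmax`, `representer_values`, `lam`) and the random data are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_l2_softmax(feats, labels, n_classes, lam=0.1, lr=0.5, steps=3000):
    # Plain gradient descent on mean cross-entropy + lam * ||W||^2 (assumed objective).
    n, d = feats.shape
    onehot = np.eye(n_classes)[labels]
    W = np.zeros((d, n_classes))
    for _ in range(steps):
        probs = softmax(feats @ W)
        grad = feats.T @ (probs - onehot) / n + 2.0 * lam * W
        W -= lr * grad
    return W

def representer_values(feats, labels, W, lam):
    # At a stationary point, W = sum_i outer(f_i, alpha_i) with
    # alpha_i = (onehot_i - probs_i) / (2 * lam * n).
    n = feats.shape[0]
    onehot = np.eye(W.shape[1])[labels]
    return (onehot - softmax(feats @ W)) / (2.0 * lam * n)

# Toy data standing in for (context representation, next-word label) pairs.
rng = np.random.default_rng(0)
feats = rng.normal(size=(60, 5))
labels = rng.integers(0, 3, size=60)
lam = 0.1

W = fit_l2_softmax(feats, labels, n_classes=3, lam=lam)
alpha = representer_values(feats, labels, W, lam)

# Contribution of training sample i to class c for one test input:
# alpha[i, c] * <f_i, f_test>.
test_feat = rng.normal(size=5)
contrib = alpha * (feats @ test_feat)[:, None]   # shape (n_samples, n_classes)
logits_from_samples = contrib.sum(axis=0)
logits_direct = test_feat @ W

pred = int(np.argmax(logits_direct))
promoting = np.argsort(-contrib[:, pred])[:3]    # largest positive contributions
deterring = np.argsort(contrib[:, pred])[:3]     # most negative contributions
print("predicted class:", pred)
print("promoting sample indices:", promoting)
print("deterring sample indices:", deterring)
```

In this sketch, samples with large positive contributions to the predicted class play the role of promoting support samples and those with large negative contributions play the role of deterring ones; samples whose contributions are near zero would correspond to non-support samples. How the paper thresholds or defines these sets is not stated in the abstract.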

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Topic Modeling