All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling

Emanuele Marconato, Sébastien Lachapelle, Sebastian Weichwald, and Luigi Gresele

arXiv:2410.23501·stat.ML·March 18, 2025

Open Access

TL;DR

This paper asks whether a linear property found in one language model must also hold in every model that induces the same next-token distribution. It proves that, under suitable conditions, such properties hold either in all distribution-equivalent next-token predictors or in none of them.

Contribution

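The paper proves an identifiability theorem that characterizes distribution-equivalent next-token predictors without the diversity requirement of previous results. Building on a refinement of relational linearity, it also shows that many notions of linearity in language models are amenable to the same analysis.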

Findings

01

Under suitable conditions, linear properties hold either in all distribution-equivalent next-token predictors or in none of them.

02

The authors prove an identifiability result that lifts a diversity requirement of previous results.

03

Many notions of linearity in language models are analyzable under their framework.
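
To make "distribution-equivalent" concrete, here is a minimal sketch of one simple way two next-token predictors can differ internally yet induce the same distribution. It assumes a softmax model whose logits are inner products between a context representation and per-token unembedding vectors; the names `f_x`, `unembeddings`, and `A`, and all numeric values, are hypothetical and not taken from the paper. It covers only the case of an invertible linear reparameterization, not the general characterization that the paper's theorem provides.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def next_token_probs(f_x, unembeddings):
    """Softmax over the logits g(y)^T f(x), one logit per token y."""
    logits = [sum(g * f for g, f in zip(g_y, f_x)) for g_y in unembeddings]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy model: 2-dimensional representations, 3 tokens.
f_x = [0.5, -1.0]                                    # context representation f(x)
unembeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one g(y) per token
A = [[2.0, 1.0], [0.0, 1.0]]                         # any invertible matrix

# Second model: f'(x) = A f(x) and g'(y) = A^{-T} g(y).
# Then g'(y)^T f'(x) = g(y)^T A^{-1} A f(x) = g(y)^T f(x), so logits agree.
f_x_alt = matvec(A, f_x)
A_inv_T = transpose(inverse_2x2(A))
unembeddings_alt = [matvec(A_inv_T, g_y) for g_y in unembeddings]

probs = next_token_probs(f_x, unembeddings)
probs_alt = next_token_probs(f_x_alt, unembeddings_alt)
max_gap = max(abs(p - q) for p, q in zip(probs, probs_alt))
print(probs, probs_alt, max_gap)
```

The two models have different representations but the same next-token probabilities up to floating-point error, which is the sense in which a property of the representations may or may not be shared by all models inducing one distribution.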

Abstract

We analyze identifiability as a possible explanation for the ubiquity of linear properties across language models, such as the vector difference between the representations of "easy" and "easiest" being parallel to that between "lucky" and "luckiest". For this, we ask whether finding a linear property in one model implies that any model that induces the same distribution has that property, too. To answer that, we first prove an identifiability result to characterize distribution-equivalent next-token predictors, lifting a diversity requirement of previous results. Second, based on a refinement of relational linearity [Paccanaro and Hinton, 2001; Hernandez et al., 2024], we show how many notions of linearity are amenable to our analysis. Finally, we show that under suitable conditions, these linear properties either hold in all or none of the distribution-equivalent next-token predictors.
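
The "easy"/"easiest" example can be illustrated with a short sketch that checks whether two difference vectors are parallel, and whether that still holds after an invertible linear map is applied to every representation. The vectors in `reps` and the matrix `A` are made-up toy values, not representations from any real model, and the linear map is only one simple instance of the transformations the paper analyzes under its stated conditions.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def cosine(u, v):
    """Cosine similarity; +1 or -1 means the vectors are parallel."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 2-dimensional representations (illustrative values only).
reps = {
    "easy": [1.0, 0.0],
    "easiest": [1.0, 2.0],
    "lucky": [3.0, 1.0],
    "luckiest": [3.0, 4.0],
}

# Linear property: the two superlative offsets point in the same direction.
d_easy = sub(reps["easiest"], reps["easy"])
d_lucky = sub(reps["luckiest"], reps["lucky"])
cos_original = cosine(d_easy, d_lucky)

# Apply the same invertible linear map to every representation.
A = [[2.0, 1.0], [0.0, 1.0]]
mapped = {word: matvec(A, vec) for word, vec in reps.items()}
d_easy_mapped = sub(mapped["easiest"], mapped["easy"])
d_lucky_mapped = sub(mapped["luckiest"], mapped["lucky"])
cos_mapped = cosine(d_easy_mapped, d_lucky_mapped)

print(cos_original, cos_mapped)
```

Because a linear map sends `c * d` to `c * (A d)`, offsets that are parallel before the map remain parallel after it, so in this toy setting the property holds in both sets of representations or in neither.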

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling