Information Locality as an Inductive Bias for Neural Language Models

Taiga Someya; Anej Svete; Brian DuSell; Timothy J. O'Donnell; Mario Giulianelli; Ryan Cotterell

arXiv:2506.05136 · cs.CL · June 6, 2025

TL;DR

This paper introduces m-local entropy, an information-theoretic measure of how well a short window of preceding symbols predicts the next one, and uses it to analyze how neural language models exploit local context. The results indicate that these models, like human language processing, are sensitive to local statistical structure.

Contribution

The paper proposes a quantitative framework and a novel measure, m-local entropy, to investigate the inductive biases of neural language models regarding local language structure.

Findings

01. Languages with higher m-local entropy are harder for neural LMs to learn.

02. Neural LMs are highly sensitive to local statistical structures.

03. The framework enables controlled studies of language model biases.

Abstract

Inductive biases are inherent in every machine learning system, shaping how models generalize from finite data. In the case of neural language models (LMs), debates persist as to whether these biases align with or diverge from human processing constraints. To address this issue, we propose a quantitative framework that allows for controlled investigations into the nature of these biases. Within our framework, we introduce $m$-local entropy – an information-theoretic measure derived from average lossy-context surprisal – that captures the local uncertainty of a language by quantifying how effectively the $m - 1$ preceding symbols disambiguate the next symbol. In experiments on both perturbed natural language corpora and languages defined by probabilistic finite-state automata (PFSAs), we show that languages with higher $m$-local entropy are more difficult for…
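To make the measure concrete, here is a minimal sketch of one natural reading of the abstract's description: the entropy of the next symbol given the $m - 1$ preceding symbols, estimated from n-gram counts. This is an illustration under stated assumptions, not the authors' implementation. The function name `m_local_entropy`, the padding symbols `<s>` and `</s>`, and the plug-in (maximum-likelihood) estimator are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def m_local_entropy(sequences, m, bos="<s>", eos="</s>"):
    """Plug-in estimate, in bits, of the entropy of the next symbol
    given the m - 1 preceding symbols.

    Assumption: this is a simple count-based estimator chosen for
    illustration; it is not taken from the paper or its repository.
    """
    if m < 1:
        raise ValueError("m must be at least 1")

    ngram_counts = Counter()    # counts of (context, next symbol)
    context_counts = Counter()  # counts of each (m - 1)-symbol context

    for seq in sequences:
        # Pad so that every symbol, including the end marker, has a
        # full context of m - 1 symbols.
        padded = [bos] * (m - 1) + list(seq) + [eos]
        for i in range(m - 1, len(padded)):
            context = tuple(padded[i - m + 1:i])
            ngram_counts[(context, padded[i])] += 1
            context_counts[context] += 1

    total = sum(ngram_counts.values())
    entropy = 0.0
    for (context, _symbol), count in ngram_counts.items():
        p_joint = count / total                   # p(context, symbol)
        p_cond = count / context_counts[context]  # p(symbol | context)
        entropy -= p_joint * math.log2(p_cond)
    return entropy


# Toy corpus: after "a" the next symbol is always "b"; after "b" it is
# either "a" or the end marker, so one preceding symbol leaves some
# uncertainty but far less than no context at all.
corpus = ["abab", "abab"]
h1 = m_local_entropy(corpus, m=1)  # no context (unigram entropy)
h2 = m_local_entropy(corpus, m=2)  # one preceding symbol
```

On this toy corpus `h2` is lower than `h1`, matching the intuition that a language is locally easier to predict when a short context disambiguates the next symbol. Note that a plug-in estimate of this kind tends to underestimate the true entropy when `m` is large relative to the corpus size, because most contexts are then seen only a few times.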

Code & Models

Repositories

rycolab/lm-inductive-bias (official)

Taxonomy

Topics: Machine Learning and Algorithms · Topic Modeling · Natural Language Processing Techniques

Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · Sigmoid Activation · ALIGN