Revisiting the Uniform Information Density Hypothesis
Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy

TL;DR
This paper investigates the uniform information density (UID) hypothesis, examining its implications for language comprehension and linguistic acceptability, and explores how uniformity should be measured and over which linguistic unit it should hold.
Contribution
It provides empirical analysis of UID using reading time and acceptability data, and compares multiple operationalizations to clarify how uniformity influences language processing.
Findings
Reading time data are consistent with a weakly super-linear effect of surprisal.
Non-uniformity in information density predicts lower acceptability judgments.
Uniformity may be best understood as a regression towards a mean surprisal across the language.
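To make the findings above concrete, here is a minimal sketch of two ways non-uniformity could be scored from per-word surprisals: a super-linear cost (sum of surprisal raised to a power k > 1) and the mean squared deviation from a reference mean, taken either within the sentence or across the language. The function names, the exponent 1.25, the surprisal values, and the language-level mean of 3.5 bits are all illustrative assumptions, not the paper's exact operationalizations or data.

```python
from statistics import mean


def superlinear_cost(surprisals, k=1.25):
    """Sum of surprisal**k. With k > 1, an uneven distribution of the same
    total surprisal costs more than an even one (k is an assumed value)."""
    return sum(s ** k for s in surprisals)


def variance_from(surprisals, reference_mean):
    """Mean squared deviation of per-word surprisal from a reference mean."""
    return mean((s - reference_mean) ** 2 for s in surprisals)


# Hypothetical per-word surprisals in bits; both sentences total 12 bits.
uniform = [3.0, 3.0, 3.0, 3.0]
peaked = [1.0, 1.0, 1.0, 9.0]
language_mean = 3.5  # assumed language-level mean surprisal, illustrative only

for name, sentence in [("uniform", uniform), ("peaked", peaked)]:
    print(
        name,
        "linear:", superlinear_cost(sentence, k=1),
        "super-linear:", round(superlinear_cost(sentence), 2),
        "sentence-level variance:", variance_from(sentence, mean(sentence)),
        "language-level variance:", variance_from(sentence, language_mean),
    )
```

Under a purely linear cost (k = 1) the two sentences are indistinguishable, since they carry the same total surprisal. Only the super-linear and variance-based scores penalize the peaked one. Note that the uniform sentence scores zero against its own mean but not against the language-level mean, which is the distinction the third finding turns on.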
Abstract
The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal. While its implications on language production have been well explored, the hypothesis potentially makes predictions about language comprehension and linguistic acceptability as well. Further, it is unclear how uniformity in a linguistic signal -- or lack thereof -- should be measured, and over which linguistic unit, e.g., the sentence or language level, this uniformity should hold. Here we investigate these facets of the UID hypothesis using reading time and acceptability data. While our reading time results are generally consistent with previous work, they are also consistent with a weakly super-linear effect of surprisal, which would be compatible with UID's predictions. For acceptability judgments, we find…
