Is language evolution grinding to a halt? The scaling of lexical turbulence in English fiction suggests it is not
Eitan Adam Pechenick, Christopher M. Danforth, Peter Sheridan Dodds

TL;DR
This study investigates the long-term dynamics of English vocabulary in fiction using a flux-based analysis of the Google Books 2012 English Fiction corpus. It finds persistent lexical turbulence, challenging earlier claims that word birth rates are falling and death rates rising.
Contribution
The paper introduces a new method for analyzing lexical evolution through word flux across frequency thresholds, demonstrating ongoing lexical turbulence in English fiction.
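A minimal sketch of the flux idea, assuming a rank cutoff stands in for a relative frequency threshold: count the words that enter and leave the top-r set between two periods. The function names (`top_ranked`, `rank_flux`) and the toy counts are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter


def top_ranked(counts, threshold):
    """Return the set of the `threshold` most frequent words.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word for word, _ in ranked[:threshold]}


def rank_flux(counts_a, counts_b, threshold):
    """Count words crossing a rank threshold between two periods.

    Returns (entered, exited): how many words are in the top `threshold`
    of period B but not of period A, and vice versa.
    """
    top_a = top_ranked(counts_a, threshold)
    top_b = top_ranked(counts_b, threshold)
    return len(top_b - top_a), len(top_a - top_b)


# Toy word counts for two hypothetical decades (made up for illustration).
decade_a = Counter({"the": 100, "of": 80, "whale": 30, "ship": 20, "radio": 5, "phone": 1})
decade_b = Counter({"the": 110, "of": 85, "phone": 40, "radio": 25, "whale": 10, "ship": 4})

for r in (2, 3, 4):
    entered, exited = rank_flux(decade_a, decade_b, r)
    print(f"rank threshold {r}: {entered} entered, {exited} exited")
```

When both periods contain at least r distinct words, the two counts are equal, because every word that enters the top r displaces one that leaves. Repeating the calculation over a range of thresholds gives flux as a function of rank, which is the quantity whose scaling the paper examines.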
Findings
Lexical flux scales superlinearly with word rank
Overall Zipf distribution remains stable over time
Evidence of persistent lexical turbulence in English fiction
Abstract
Of basic interest is the quantification of the long-term growth of a language's lexicon as it develops to more completely cover both a culture's communication requirements and knowledge space. Here, we explore the usage dynamics of words in the English language as reflected by the Google Books 2012 English Fiction corpus. We critique an earlier method that found decreasing birth and increasing death rates of words over the second half of the 20th century, showing death rates to be strongly affected by the imposed time cutoff of the arbitrary present and not increasing dramatically. We provide a robust, principled approach to examining lexical evolution by tracking the volume of word flux across various relative frequency thresholds. We show that while the overall statistical structure of the English language remains stable over time in terms of its raw Zipf distribution, we find…
