Languages cool as they expand: Allometric scaling and the decreasing need for new words
Alexander M. Petersen, Joel N. Tenenbaum, Shlomo Havlin, H. Eugene Stanley, Matjaz Perc

TL;DR
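This study analyzes word frequencies in millions of books published over two centuries in seven languages. As a language's corpus grows, its vocabulary follows an allometric scaling relation in which the marginal need for new words decreases, and the annual fluctuations in word use shrink, indicating that linguistic evolution slows down.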
Contribution
It demonstrates a decreasing marginal need for new words in expanding languages and introduces a dynamic 'cooling' pattern in linguistic evolution based on large corpus analysis.
Findings
Confirmation of two scaling regimes in word frequency distributions
Demonstration of decreasing marginal need for new words with language growth
Identification of a slowdown in linguistic evolution over time
Abstract
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This "cooling pattern" forms the basis of a third statistical…
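The allometric relation between corpus size and vocabulary size mentioned in the abstract is commonly written as a power law (often called Heaps' law), V ≈ a·N^b, where an exponent b < 1 implies that each additional word of text contributes fewer new vocabulary entries. The following is a minimal sketch of how such an exponent could be estimated by a log-log least-squares fit. The function names, the synthetic data, and the values a = 50 and b = 0.5 are illustrative assumptions, not figures or methods taken from the paper.

```python
import math

def fit_heaps_exponent(corpus_sizes, vocab_sizes):
    """Fit V = a * N**b by ordinary least squares on log-log axes.

    Returns (b, a). This is a generic estimator, not necessarily the
    fitting procedure used by the authors.
    """
    xs = [math.log(n) for n in corpus_sizes]
    ys = [math.log(v) for v in vocab_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = scaling exponent
    a = math.exp(mean_y - b * mean_x)  # intercept = log of prefactor
    return b, a

def marginal_new_words(corpus_size, a, b):
    """Derivative dV/dN = a * b * N**(b - 1): new words per added word of text."""
    return a * b * corpus_size ** (b - 1)

# Synthetic, hypothetical data: corpus sizes from 10^6 to 10^10 words,
# with vocabulary generated from an assumed exponent of 0.5.
corpus_sizes = [10 ** k for k in range(6, 11)]
vocab_sizes = [50.0 * n ** 0.5 for n in corpus_sizes]

b, a = fit_heaps_exponent(corpus_sizes, vocab_sizes)
rates = [marginal_new_words(n, a, b) for n in corpus_sizes]

for n, rate in zip(corpus_sizes, rates):
    print(f"corpus size {n:.0e}: about {rate:.2e} new words per added word")
```

With any fitted exponent below 1, the printed rates fall as the corpus grows, which is the "decreasing marginal need for new words" in quantitative form. On real corpora the fit would be run on yearly (corpus size, vocabulary size) pairs rather than on synthetic values.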
