Identifying trends in word frequency dynamics
Eduardo G. Altmann, Zakary L. Whichard, Adilson E. Motter

TL;DR
This paper introduces a model that distinguishes temporary from persistent increases in word frequency, and finds a strong relation between changes in a word's dissemination and changes in its frequency.
Contribution
The paper presents a model that separates persistent from temporary increases in word frequency, illustrated on a 10^8-word database from an online discussion group and a 10^11-word collection of digitized books.
Findings
Changes in word dissemination are strongly related to changes in word frequency
New words must survive in the short term in order to survive in the long term
Model illustrated on an online discussion group (10^8 words) and a collection of digitized books (10^11 words)
Abstract
The word-stock of a language is a complex dynamical system in which words can be created, evolve, and become extinct. Even more dynamic are the short-term fluctuations in word usage by individuals in a population. Building on the recent demonstration that word niche is a strong determinant of future rise or fall in word frequency, here we introduce a model that allows us to distinguish persistent from temporary increases in frequency. Our model is illustrated using a 10^8-word database from an online discussion group and a 10^11-word collection of digitized books. The model reveals a strong relation between changes in word dissemination and changes in frequency. Aside from their implications for short-term word frequency dynamics, these observations are potentially important for language evolution as new words must survive in the short term in order to survive in the long term.
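To make the two quantities in the abstract concrete, here is a minimal sketch that compares a word's frequency and a crude dissemination proxy across two time windows. It is an illustration under stated assumptions, not the authors' model: the function names, the toy data, and the proxy itself (the share of users who use the word at least once) are hypothetical and are not taken from the paper, which may define dissemination differently.

```python
from math import log

# Each window is a list of (user, tokens) posts. This layout is an assumption
# made for illustration; the paper's data format is not described here.

def frequency(word, posts):
    """Relative frequency of `word` among all tokens in `posts`."""
    total = sum(len(tokens) for _, tokens in posts)
    count = sum(tokens.count(word) for _, tokens in posts)
    return count / total if total else 0.0

def user_share(word, posts):
    """Share of distinct users who used `word` at least once.

    A crude stand-in for dissemination, NOT the measure used in the paper.
    """
    users = {user for user, _ in posts}
    adopters = {user for user, tokens in posts if word in tokens}
    return len(adopters) / len(users) if users else 0.0

def log_change(before, after):
    """Logarithmic change between two positive values."""
    return log(after / before)

# Toy data: two consecutive time windows from a hypothetical discussion group.
window_1 = [("ann", ["lol", "the", "cat"]), ("bob", ["the", "dog", "ran"])]
window_2 = [("ann", ["lol", "lol", "cat"]), ("bob", ["lol", "dog", "ran"])]

freq_change = log_change(frequency("lol", window_1), frequency("lol", window_2))
share_change = log_change(user_share("lol", window_1), user_share("lol", window_2))
print(f"log frequency change: {freq_change:.3f}")
print(f"log user-share change: {share_change:.3f}")
```

In this toy case both quantities rise together, since the word is used more often and by more users. Telling a persistent rise from a temporary one would need more than two windows, and that is the part the paper's model addresses.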
