Zipf's law in 50 languages: its structural pattern, linguistic interpretation, and cognitive motivation
Shuiyuan Yu, Chunshan Xu, Haitao Liu

TL;DR
This study investigates Zipf's law across 50 languages, revealing a universal three-segment pattern in word frequency distributions and suggesting a cognitive basis rooted in dual-process mechanisms.
Contribution
It provides the first large-scale cross-linguistic analysis of Zipf's law, identifying a universal structural pattern and linking it to cognitive processes through simulation.
Findings
All 50 languages exhibit a three-segment Zipf's law pattern.
The lower segment consistently deviates downward from theoretical predictions.
A simulation based on cognitive dual-process theory can reproduce the observed structural pattern.
Abstract
Zipf's law has been found in many human-related fields, including language, where the frequency of a word is persistently found to be a power-law function of its frequency rank. However, there is much dispute over whether it is a universal law or a statistical artifact, and little is known about what mechanisms may have shaped it. To answer these questions, this study conducted a large-scale cross-language investigation into Zipf's law. The statistical results show that Zipf's laws in 50 languages all share a 3-segment structural pattern, with each segment demonstrating distinctive linguistic properties and the lower segment invariably bending downwards to deviate from theoretical expectation. This finding indicates that this deviation is a fundamental and universal feature of word frequency distributions in natural languages, not the statistical error of low frequency…
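The rank-frequency relation described in the abstract can be illustrated with a minimal Python sketch. It assumes the simplest form of the law, f(r) ≈ C · r^(-alpha), and fits a single straight line in log-log space by ordinary least squares. This is not the authors' three-segment procedure, which the truncated abstract does not describe; the function names `rank_frequency` and `fit_zipf_exponent` and the synthetic token list are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent word."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


def fit_zipf_exponent(rank_freq):
    """Fit log f = log C - alpha * log r by least squares; return (alpha, C).

    Assumption: one power law over all ranks. A segmented analysis would
    fit separate lines to separate rank ranges.
    """
    if len(rank_freq) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(r) for r, _ in rank_freq]
    ys = [math.log(f) for _, f in rank_freq]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic corpus (illustrative only): word i occurs about 1000 / i times.
    tokens = [f"w{i}" for i in range(1, 21) for _ in range(round(1000 / i))]
    alpha, c = fit_zipf_exponent(rank_frequency(tokens))
    print(f"alpha = {alpha:.3f}, C = {c:.1f}")
```

On real text, a single fit like this tends to hide curvature at the low-frequency end. Plotting log frequency against log rank and fitting separate rank ranges is one way to examine the kind of downward deviation the abstract reports.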
Taxonomy
Topics: Language and cultural evolution · Authorship Attribution and Profiling · Opinion Dynamics and Social Influence
