A simple branching model that reproduces language family and language population distributions
V. Schwämmle, P. M. C. de Oliveira

TL;DR
This paper introduces a simple stochastic model of language evolution that reproduces observed language family and population distributions by assuming independent language changes and a finite set of distinguishing features.
Contribution
The model is novel in combining independent change dynamics with a finite feature set to accurately replicate language distribution patterns.
Findings
Model matches real language family and population data
Language differences can be characterized by a small set of features
Independent change assumption aligns with observed language evolution patterns
Abstract
Human history leaves fingerprints in human languages. Little is known about language evolution, and its study is of great importance. Here, we construct a simple stochastic model and compare its results to statistical data on real languages. The model is based on the recent finding that language changes occur independently of population size. We find agreement with the data when we additionally assume that languages may be distinguished by having at least one among a finite, small number of different features. This finite set is also used to define the distance between two languages, in line with the linguistic tradition since Swadesh.
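The feature-based distance mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how features are encoded, so everything below is an assumption made for illustration: each language is a fixed-length tuple of binary features, the distance is the Hamming distance (the number of differing features), and a language change flips a single feature. The names `hamming_distance`, `is_distinct`, and `mutate` are hypothetical and not taken from the paper.

```python
import random


def hamming_distance(a, b):
    """Count the feature positions at which two languages differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_distinct(a, b):
    """Two languages count as different if at least one feature differs."""
    return hamming_distance(a, b) >= 1


def mutate(features, rng):
    """Return a copy with one randomly chosen binary feature flipped.

    Assumption: a single language change alters exactly one feature.
    """
    i = rng.randrange(len(features))
    child = list(features)
    child[i] = 1 - child[i]
    return tuple(child)


# Illustrative setup: 8 binary features (the number is an assumption).
rng = random.Random(0)
parent = (0, 0, 0, 0, 0, 0, 0, 0)
child = mutate(parent, rng)
```

Under these assumptions a parent and its freshly branched child are always at distance 1, and distances between more remote languages grow as independent changes accumulate, up to the number of features.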
