Evolution of the lexicon: a probabilistic point of view
Maurizio Serva

TL;DR
This paper analyzes the probabilistic limits of estimating language divergence times using lexicon evolution, considering both word replacement and gradual lexical modifications, highlighting their impact on accuracy.
Contribution
It introduces a probabilistic framework that accounts for both word replacement and lexical modification, improving the understanding of language divergence estimation.
Findings
Limits on accuracy due to the probabilistic nature of word replacement
Gradual lexical modifications significantly influence language evolution
Incorporating lexical modifications enhances temporal separation estimates
Abstract
The Swadesh approach for determining the temporal separation between two languages relies on the stochastic process of word replacement (when a completely new word emerges to represent a given concept). It is well known that the basic assumptions of the Swadesh approach are often unrealistic due to various contamination phenomena and misjudgments (horizontal transfers, variations over time and space of the replacement rate, incorrect assessments of cognacy relationships, presence of synonyms, and so on). All of this means that the results cannot be completely correct. More importantly, even in the unrealistic case that all basic assumptions are satisfied, simple mathematics places limits on the accuracy of estimating the temporal separation between two languages. These limits, which are purely probabilistic in nature and which are often neglected in lexicostatistical studies, are…
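The purely probabilistic limit described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes the textbook idealization in which each word on a finite list is replaced independently at a constant rate in each of the two lineages, so that a word survives in both with probability exp(-2 * rate * t). The function names, the rate value, and the list sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def estimate_separation(shared_fraction, rate):
    """Invert shared_fraction = exp(-2 * rate * t) to recover t."""
    return -math.log(shared_fraction) / (2.0 * rate)


def simulate_estimates(true_time, rate, n_words, n_trials, seed=0):
    """Simulate idealized word lists and return the estimated separations.

    Assumption (illustrative): every word is replaced independently at the
    same constant rate in both lineages, with no borrowing or misjudged
    cognacy.
    """
    rng = random.Random(seed)
    p_shared = math.exp(-2.0 * rate * true_time)
    estimates = []
    for _ in range(n_trials):
        # Count the words that were replaced in neither lineage.
        shared = sum(rng.random() < p_shared for _ in range(n_words))
        if shared == 0:
            continue  # the estimate is undefined when nothing is shared
        estimates.append(estimate_separation(shared / n_words, rate))
    return estimates


def mean_and_std(values):
    """Return the mean and (population) standard deviation of values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical values: time in millennia, rate per millennium.
    for n_words in (100, 200):
        estimates = simulate_estimates(2.0, 0.14, n_words, n_trials=2000)
        mean, std = mean_and_std(estimates)
        print(f"{n_words} words: mean estimate {mean:.2f}, spread {std:.2f}")
```

Even with every assumption satisfied by construction, the estimated separation scatters around the true value because the word list is finite. The spread shrinks only slowly as the list grows.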
