Simple stochastic processes behind Menzerath's Law
Jiří Milička

TL;DR
This paper demonstrates that simple stochastic processes, modeled by bivariate log-normal distributions and Gaussian copulas, can effectively explain Menzerath's Law in language, aligning well with empirical data.
Contribution
It introduces a new stochastic model based on multiplicative changes and Gaussian copulas that better fits Menzerath's Law data than previous models.
Findings
A bivariate log-normal distribution models word length measured in syllables and phonemes.
A Gaussian copula improves the model's accuracy.
Models align well with empirical linguistic data.
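The first finding can be illustrated with a minimal simulation sketch. It is not the paper's code: the function names, the parameter values, and the use of continuous (non-integer) lengths are all assumptions made for illustration. It draws log-lengths in syllables and phonemes from a correlated bivariate normal and regresses the log of phonemes per syllable on the log of syllable count. For a bivariate log-normal, that slope is rho * sigma_p / sigma_s - 1, which is negative whenever the correlation is imperfect and the spreads are comparable, so longer words tend to have shorter syllables.

```python
import math
import random


def simulate_lengths(n, mu_s=1.0, mu_p=2.0, sigma_s=0.5, sigma_p=0.5,
                     rho=0.8, seed=0):
    """Draw n (syllables, phonemes) pairs from a bivariate log-normal.

    All parameter values are illustrative assumptions, not fitted values.
    Lengths are continuous here; real word lengths are integers.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Correlated standard normals built from two independent draws.
        log_s = mu_s + sigma_s * z1
        log_p = mu_p + sigma_p * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        pairs.append((math.exp(log_s), math.exp(log_p)))
    return pairs


def log_log_slope(pairs):
    """Least-squares slope of log(phonemes / syllables) on log(syllables)."""
    xs = [math.log(s) for s, _ in pairs]
    ys = [math.log(p / s) for s, p in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rho, sigma_s, sigma_p = 0.8, 0.5, 0.5
pairs = simulate_lengths(20000, sigma_s=sigma_s, sigma_p=sigma_p, rho=rho)
print("estimated slope:  ", log_log_slope(pairs))
print("theoretical slope:", rho * sigma_p / sigma_s - 1.0)
```

A negative slope on the log-log scale corresponds to a power-law decline of mean constituent length with construct length, which is the Menzerathian pattern.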
Abstract
This paper revisits Menzerath's Law, also known as the Menzerath-Altmann Law, which models the relationship between the length of a linguistic construct and the average length of its constituents. Recent findings indicate that simple stochastic processes can display Menzerathian behaviour, though existing models fail to accurately reflect real-world data. If we adopt the basic principle that a word can change its length in both syllables and phonemes, where the correlation between these variables is not perfect and the changes are multiplicative in nature, we get a bivariate log-normal distribution. The present paper shows that from this very simple principle we obtain the classic Altmann model of the Menzerath-Altmann Law. If we model the joint distribution separately and independently from the marginal distributions, we can obtain an even more accurate model by using a Gaussian…
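To make concrete the idea of modelling the joint dependence separately from the marginals, here is a minimal sketch of sampling from a Gaussian copula with arbitrary marginals. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, and the exponential marginals are placeholders chosen only because their inverse CDF is simple.

```python
import math
import random
from statistics import NormalDist


def exponential_inv_cdf(rate):
    """Inverse CDF of an exponential marginal (a placeholder choice)."""
    return lambda u: -math.log(1.0 - u) / rate


def sample_gaussian_copula(n, rho, inv_cdf_x, inv_cdf_y, seed=0):
    """Draw n pairs whose dependence follows a Gaussian copula.

    rho controls the dependence only; the marginals are set entirely by
    the two inverse-CDF callables supplied by the caller.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map correlated normals to correlated uniforms on (0, 1).
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Push each uniform through its own marginal inverse CDF.
        samples.append((inv_cdf_x(u1), inv_cdf_y(u2)))
    return samples


samples = sample_gaussian_copula(
    20000, rho=0.8,
    inv_cdf_x=exponential_inv_cdf(0.5),
    inv_cdf_y=exponential_inv_cdf(0.2),
)
print("mean x:", sum(x for x, _ in samples) / len(samples))
print("mean y:", sum(y for _, y in samples) / len(samples))
```

Swapping in a different inverse CDF changes a marginal without touching the dependence structure, which is the separation the abstract describes.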
Taxonomy
Topics: Stochastic processes and financial applications
