Modeling the dynamics of language change: logistic regression, Piotrowski's law, and a handful of examples in Polish
Rafał L. Górski, Maciej Eder

TL;DR
This paper applies logistic regression to model historical language changes in Polish from the 15th to 18th centuries, extending Piotrowski's law with polynomial models and analyzing multiple change cases jointly.
Contribution
It introduces polynomial logistic regression for language change modeling and evaluates the influence of corpus size on model accuracy.
Findings
Most language changes follow expected nonlinear patterns
Polynomial logistic regression improves model fit for complex changes
Subcorpus size affects the goodness-of-fit of the models
Abstract
The study discusses modeling diachronic processes by logistic regression. The phenomenon of nonlinear changes in language was first observed by Raimund Piotrowski (hence the label Piotrowski's law), even if actual linguistic evidence usually speaks against using the notion of a "law" in this context. In our study, we apply logistic regression models to 9 changes which occurred in the Polish language between the 15th and 18th centuries. The attested course of the majority of these changes closely follows the expected values, which shows that language change might indeed resemble a nonlinear phase-change scenario. We also extend Piotrowski's original approach by proposing polynomial logistic regression for those cases which can hardly be described by its standard version. Also, we propose to consider individual language change cases jointly, in order to inspect their possible…
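To make the modeling idea in the abstract concrete, here is a minimal sketch of a standard (linear-logit) and a polynomial (quadratic-logit) logistic regression fitted to the proportion of a new form over time. It is not the authors' code: the counts are synthetic and purely illustrative, the 50-year binning and the scaling of time are assumptions, and the function names (`design_matrix`, `fit_binomial_logit`, `predict`, `neg_log_likelihood`) are hypothetical helpers written for this example.

```python
import numpy as np

def design_matrix(t, degree):
    """Columns 1, t, t^2, ..., t^degree for a scaled time variable."""
    return np.vander(t, degree + 1, increasing=True)

def fit_binomial_logit(t, new_counts, totals, degree=1, n_iter=50):
    """Fit logit(p) = b0 + b1*t + ... + bd*t^d by Newton's method (IRLS)."""
    X = design_matrix(t, degree)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = totals * p * (1.0 - p)          # binomial variance per bin
        grad = X.T @ (new_counts - totals * p)    # score vector
        hess = X.T @ (X * weights[:, None])       # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def predict(t, beta):
    """Expected share of the new form at (scaled) time t."""
    X = design_matrix(t, len(beta) - 1)
    return 1.0 / (1.0 + np.exp(-(X @ beta)))

def neg_log_likelihood(t, new_counts, totals, beta):
    """Binomial negative log-likelihood, up to a constant; lower is better."""
    p = predict(t, beta)
    return -np.sum(new_counts * np.log(p) + (totals - new_counts) * np.log(1.0 - p))

# Synthetic, illustrative data (NOT from the paper): one subcorpus per
# 50-year bin, 200 tokens each, with the number of tokens showing the new form.
years = np.arange(1400.0, 1801.0, 50.0)
totals = np.full(years.shape, 200.0)
new_counts = np.array([4.0, 9.0, 24.0, 54.0, 100.0, 146.0, 176.0, 191.0, 196.0])

# Centre and scale time so that the Newton steps stay well conditioned.
t = (years - 1600.0) / 100.0

beta_lin = fit_binomial_logit(t, new_counts, totals, degree=1)   # standard S-curve
beta_quad = fit_binomial_logit(t, new_counts, totals, degree=2)  # polynomial variant

nll_lin = neg_log_likelihood(t, new_counts, totals, beta_lin)
nll_quad = neg_log_likelihood(t, new_counts, totals, beta_quad)

# Year at which the new form is expected to reach a 50% share (linear model).
midpoint_year = 1600.0 + 100.0 * (-beta_lin[0] / beta_lin[1])

print("linear coefficients:   ", beta_lin)
print("quadratic coefficients:", beta_quad)
print("negative log-likelihoods (linear, quadratic):", nll_lin, nll_quad)
print("estimated midpoint year:", midpoint_year)
```

In this sketch the quadratic model nests the linear one, so its fit can never be worse on the same data; whether the extra term is justified for a given change would need a proper model comparison, which is left out here.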
Taxonomy
Topics: Language and Culture · Bayesian Methods and Mixture Models
Methods: Logistic Regression
