A probabilistic assessment of the Indo-Aryan Inner-Outer Hypothesis
Chundra A. Cathcart

TL;DR
This paper develops a Bayesian hierarchical mixed-membership model to evaluate the Indo-Aryan Inner-Outer hypothesis. It finds evidence for cohesive dialect groups and, under a logistic normal prior, a core-periphery pattern similar to the one the hypothesis proposes.
Contribution
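It applies a Bayesian hierarchical mixed-membership model to a large data set of automatically extracted sound changes between Old Indo-Aryan and Modern Indo-Aryan speech varieties, comparing prior distributions that include the logistic normal, and provides quantitative evidence bearing on the Inner-Outer hypothesis.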
Findings
Identifies cohesive dialect groups in Indo-Aryan languages.
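Finds that, under a logistic normal prior, the distribution of dialect components across languages is largely compatible with a core-periphery pattern similar to the one the hypothesis proposes.
Shows that the logistic normal prior, which has received little attention in linguistics outside Natural Language Processing, can be used to model sound change.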
Abstract
This paper uses a novel data-driven probabilistic approach to address the century-old Inner-Outer hypothesis of Indo-Aryan. I develop a Bayesian hierarchical mixed-membership model to assess the validity of this hypothesis using a large data set of automatically extracted sound changes operating between Old Indo-Aryan and Modern Indo-Aryan speech varieties. I employ different prior distributions in order to model sound change, one of which, the logistic normal distribution, has not received much attention in linguistics outside of Natural Language Processing, despite its many attractive features. I find evidence for cohesive dialect groups that have made their imprint on contemporary Indo-Aryan languages, and find that when a logistic normal prior is used, the distribution of dialect components across languages is largely compatible with a core-periphery pattern similar to that proposed…
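As a rough illustration of the logistic normal distribution mentioned in the abstract, the sketch below draws mixed-membership proportions by sampling a Gaussian vector and mapping it onto the simplex with a softmax. It is not the paper's model: the number of components `K`, the mean, the covariance matrix, and the function name `sample_logistic_normal` are all assumptions chosen for illustration.

```python
import numpy as np

def sample_logistic_normal(mean, cov, size, rng):
    """Draw `size` probability vectors from a logistic normal distribution.

    Each draw is a Gaussian vector pushed through a softmax, so every row
    is non-negative and sums to 1.
    """
    z = rng.multivariate_normal(mean, cov, size=size)
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

K = 3  # hypothetical number of dialect components
mean = np.zeros(K)
# Hypothetical covariance: components 0 and 1 tend to rise and fall together.
cov = np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# One row per (hypothetical) language: its proportions over the K components.
theta = sample_logistic_normal(mean, cov, size=1000, rng=rng)
print(theta[:3])
```

Unlike a Dirichlet prior, the covariance matrix here lets pairs of components be correlated, which is one commonly cited attraction of the logistic normal for mixed-membership models.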
