A probabilistic assessment of the Indo-Aryan Inner-Outer Hypothesis

Chundra A. Cathcart

arXiv:1912.01957·cs.CL·December 5, 2019

TL;DR

This paper introduces a Bayesian hierarchical mixed-membership model to evaluate the Indo-Aryan Inner-Outer hypothesis. It finds cohesive dialect groupings and, when a logistic normal prior is used, a core-periphery pattern largely compatible with the hypothesis.

Contribution

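The paper develops a Bayesian hierarchical mixed-membership model over automatically extracted sound changes between Old Indo-Aryan and Modern Indo-Aryan speech varieties. It compares several prior distributions, including the logistic normal, to give a quantitative assessment of the Inner-Outer hypothesis.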

Findings

01

Identifies cohesive dialect groups in Indo-Aryan languages.

02

Finds that, under a logistic normal prior, the distribution of dialect components across languages is largely compatible with a core-periphery pattern.

03

Applies the logistic normal prior, which has received little attention in linguistics outside Natural Language Processing, to the modeling of sound change.

Abstract

This paper uses a novel data-driven probabilistic approach to address the century-old Inner-Outer hypothesis of Indo-Aryan. I develop a Bayesian hierarchical mixed-membership model to assess the validity of this hypothesis using a large data set of automatically extracted sound changes operating between Old Indo-Aryan and Modern Indo-Aryan speech varieties. I employ different prior distributions in order to model sound change, one of which, the logistic normal distribution, has not received much attention in linguistics outside of Natural Language Processing, despite its many attractive features. I find evidence for cohesive dialect groups that have made their imprint on contemporary Indo-Aryan languages, and find that when a logistic normal prior is used, the distribution of dialect components across languages is largely compatible with a core-periphery pattern similar to that proposed…
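
For readers unfamiliar with the logistic normal distribution mentioned in the abstract, the sketch below illustrates the general idea: draw a Gaussian vector and pass it through a softmax to obtain a point on the probability simplex, which could serve as one language's mixture over dialect components. This is a minimal illustration and not the paper's implementation. The function name `sample_logistic_normal`, the choice of three components, and the independent (diagonal-covariance) Gaussian draws are all assumptions made for the example.

```python
import math
import random


def sample_logistic_normal(mean, std, rng):
    """Draw one point on the simplex from a logistic normal distribution.

    Assumption for illustration: components are independent Gaussians
    sharing one standard deviation (a diagonal covariance). The general
    logistic normal allows a full covariance matrix.
    """
    # Step 1: draw an unconstrained Gaussian vector.
    z = [rng.gauss(m, std) for m in mean]
    # Step 2: softmax maps it to the simplex; subtracting the max
    # keeps the exponentials numerically stable.
    shift = max(z)
    exp_z = [math.exp(v - shift) for v in z]
    total = sum(exp_z)
    return [v / total for v in exp_z]


# Hypothetical setup: 3 dialect components, zero mean, unit scale.
rng = random.Random(0)
theta = sample_logistic_normal([0.0, 0.0, 0.0], 1.0, rng)
print(theta)
```

The resulting `theta` has non-negative entries that sum to one. Because the draw starts from a Gaussian, a full covariance matrix could in principle encode correlations between components, something a Dirichlet prior cannot express directly.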
