Non-Parametric Bayesian Areal Linguistics
Hal Daumé III

TL;DR
This paper introduces a non-parametric Bayesian model of linguistic areas and phylogeny. The model recovers known areas, identifies a plausible hierarchy of areal features, and improves genetic reconstruction of languages.
Contribution
It presents a Bayesian framework that models linguistic areas with a Pitman-Yor process and linguistic phylogeny with Kingman's coalescent. Accounting for areas improves genetic reconstruction of languages.
Findings
Accurately recovers known linguistic areas
Identifies plausible hierarchy of areal features
Improves genetic reconstruction of languages
Abstract
We describe a statistical model over linguistic areas and phylogeny. Our model recovers known areas and identifies a plausible hierarchy of areal features. The use of areas improves genetic reconstruction of languages both qualitatively and quantitatively according to a variety of metrics. We model linguistic areas by a Pitman-Yor process and linguistic phylogeny by Kingman's coalescent.
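The two priors named in the abstract can be illustrated with a minimal sketch. It only draws samples from each prior in isolation: a partition of languages into clusters via the Chinese restaurant construction of the Pitman-Yor process, and a binary tree via Kingman's coalescent. It is not the paper's model, which couples these priors with linguistic feature data and performs inference. The function names, the parameter names (discount, strength), and the restriction to strength > 0 are assumptions made for this illustration.

```python
import random


def pitman_yor_partition(n, discount, strength, rng):
    """Assign n items to clusters with the Pitman-Yor Chinese restaurant process.

    Hypothetical helper for illustration. Restricted to 0 <= discount < 1 and
    strength > 0 so that every weight below is strictly positive.
    """
    assert 0 <= discount < 1 and strength > 0
    sizes = []   # sizes[k] = number of items currently in cluster k
    labels = []  # labels[i] = cluster index chosen for item i
    for _ in range(n):
        # An existing cluster k is chosen with weight sizes[k] - discount;
        # a new cluster is opened with weight strength + discount * (#clusters).
        weights = [s - discount for s in sizes]
        weights.append(strength + discount * len(sizes))
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels


def kingman_coalescent(leaves, rng):
    """Sample a binary tree over the given leaves from Kingman's coalescent.

    Hypothetical helper for illustration. With k lineages remaining, the
    waiting time to the next merge is exponential with rate k*(k-1)/2, and
    the pair to merge is chosen uniformly. Internal nodes are tuples
    (left, right, time), where time is measured backward from the leaves.
    """
    lineages = list(leaves)
    t = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        t += rng.expovariate(k * (k - 1) / 2)
        i, j = rng.sample(range(k), 2)
        merged = (lineages[i], lineages[j], t)
        lineages = [x for idx, x in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return lineages[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy run: cluster 20 items, then draw a tree over four placeholder leaves.
    print(pitman_yor_partition(20, discount=0.5, strength=1.0, rng=rng))
    print(kingman_coalescent(["A", "B", "C", "D"], rng))
```

In this sketch a larger discount tends to produce more small clusters, and the coalescent tree is built backward in time, so merges near the leaves happen quickly and the final merge at the root typically takes longest.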
Taxonomy
Topics: Bayesian Methods and Mixture Models · Algorithms and Data Compression · Data Management and Algorithms
