Phylogenetic typology

Gerhard Jäger; Johannes Wahle

arXiv:2103.10198 · q-bio.PE · March 22, 2021

TL;DR

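This paper introduces a method for estimating the frequency distribution of linguistic variables while controlling for shared ancestry among languages. It uses phylogenies inferred from lexical data together with a Markov model, and applies the method to potential word-order correlations across the world's languages as a case study.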

Contribution

It presents a novel approach combining phylogenetic inference and statistical modeling to analyze linguistic data across diverse language families and isolates.

Findings

01 Frequency estimates control for language relatedness on a continuous scale estimated from the data

02 Phylogenetic models are applied to a series of potential word-order correlations

03 Method uses data from large and small language families as well as from isolates

Abstract

In this article we propose a novel method to estimate the frequency distribution of linguistic variables while controlling for statistical non-independence due to shared ancestry. Unlike previous approaches, our technique uses all available data, from language families large and small as well as from isolates, while controlling for different degrees of relatedness on a continuous scale estimated from the data. Our approach involves three steps: First, distributions of phylogenies are inferred from lexical data. Second, these phylogenies are used as part of a statistical model to estimate transition rates between parameter states. Finally, the long-term equilibrium of the resulting Markov process is computed. As a case study, we investigate a series of potential word-order correlations across the languages of the world.
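
The last step named in the abstract, computing the long-term equilibrium of a Markov process from transition rates, can be illustrated with a minimal sketch. It is not taken from the paper or its repository. The function name `stationary_distribution`, the two-state example, and the rate values are hypothetical, and uniformization with power iteration is just one convenient way to find the equilibrium. The sketch assumes a continuous-time chain described by a rate matrix whose rows sum to zero.

```python
def stationary_distribution(Q, tol=1e-12, max_iter=100_000):
    """Equilibrium distribution pi of a continuous-time Markov chain.

    Q is a square rate matrix (list of lists): off-diagonal entries are
    non-negative transition rates and every row sums to zero.
    The function solves pi * Q = 0 by uniformization: P = I + Q / lam is a
    stochastic matrix with the same equilibrium, found by power iteration.
    """
    n = len(Q)
    # Choose lam strictly above the largest exit rate so that every state
    # keeps a positive self-transition probability (P is aperiodic).
    lam = 2.0 * max(-Q[i][i] for i in range(n))
    if lam <= 0.0:
        raise ValueError("all transition rates are zero; equilibrium is not unique")
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical binary feature with made-up rates:
# state 0 -> state 1 at rate 0.2, state 1 -> state 0 at rate 0.6.
Q = [[-0.2, 0.2],
     [0.6, -0.6]]
pi = stationary_distribution(Q)
print([round(p, 6) for p in pi])  # approximately [0.75, 0.25]
```

For a two-state chain the result can be checked by hand: the equilibrium probability of state 0 is 0.6 / (0.2 + 0.6) = 0.75. The same function accepts larger rate matrices, for example a four-state chain over the joint values of two binary features.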

Code & Models

Repositories

gerhardJaeger/phylogeneticTypology (Official)

Taxonomy

Topics: Language and cultural evolution · Authorship Attribution and Profiling · Forensic and Genetic Research