Computational Historical Linguistics

Gerhard Jäger

arXiv:1805.08099·cs.CL·May 22, 2018

TL;DR

This paper reviews recent advances in computational historical linguistics, highlighting methods like genetic relatedness assessment, cognate detection, and phylogenetic inference, demonstrated through reconstructing Proto-Romance words from modern languages.

Contribution

It introduces key research topics in computational historical linguistics and demonstrates their application by reconstructing a Proto-Romance word list from lexical data of 50 modern Romance languages and dialects.

Findings

01 Successful automatic reconstruction of a Proto-Romance word list

02 Enhanced methods for genetic relatedness and cognate detection

03 Phylogenetic inference applied to Romance languages

Abstract

Computational approaches to historical linguistics have been proposed for half a century. Within the last decade, this line of research has received a major boost, owing both to the transfer of ideas and software from computational biology and to the release of several large electronic data resources suitable for systematic comparative work. In this article, some of the central research topics of this new wave of computational historical linguistics are introduced and discussed. These are automatic assessment of genetic relatedness, automatic cognate detection, phylogenetic inference, and ancestral state reconstruction. They are demonstrated by means of a case study: automatically reconstructing a Proto-Romance word list from lexical data of 50 modern Romance languages and dialects.

Code & Models

Repositories

gerhardJaeger/protoRomance (official)