# Ab Antiquo: Neural Proto-language Reconstruction

**Authors:** Carlo Meloni, Shauli Ravfogel, Yoav Goldberg

arXiv: 1908.02477 · 2021-05-11

## TL;DR

This paper applies neural sequence models to reconstructing proto-words from cognates in contemporary daughter languages. The models outperform the conventional methods previously applied to the task, and their learned embeddings reflect phonologically meaningful generalizations.

## Contribution

The paper applies neural sequence models to proto-word reconstruction, contributes a new dataset of over 8,000 comparative entries, and analyzes which phonological changes the models capture.

## Key findings

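- Neural sequence models outperform the conventional methods previously applied to proto-word reconstruction.
- Learned embeddings show phonologically meaningful generalizations that correspond to well-attested phonological shifts.
- The models capture some phonological changes better than others, and this variability correlates with the complexity of the changes.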

## Abstract

Historical linguists have identified regularities in the process of historic sound change. The comparative method utilizes those regularities to reconstruct proto-words based on observed forms in daughter languages. Can this process be efficiently automated? We address the task of proto-word reconstruction, in which the model is exposed to cognates in contemporary daughter languages, and has to predict the proto-word in the ancestor language. We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional methods applied to this task so far. Error analysis reveals variability in the ability of neural models to capture different phonological changes, correlating with the complexity of the changes. Analysis of learned embeddings reveals the models learn phonologically meaningful generalizations, corresponding to well-attested phonological shifts documented by historical linguistics.
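
The task framing in the abstract (cognates in, proto-word out) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not taken from this summary: the language-tag input encoding, the helper names `encode_cognate_set` and `edit_distance`, the choice of edit distance as a scoring function, and the example words (Spanish "noche", Italian "notte", Latin "noctem"), which are not claimed to come from the paper's dataset.

```python
from typing import Dict, List


def encode_cognate_set(cognates: Dict[str, str]) -> List[str]:
    """Flatten a cognate set into one token sequence.

    Hypothetical encoding: each word is preceded by a language tag such as
    "<es>", and words are split into single characters. Languages are
    visited in sorted order so the encoding is deterministic.
    """
    tokens: List[str] = []
    for lang in sorted(cognates):
        tokens.append(f"<{lang}>")
        tokens.extend(cognates[lang])
    return tokens


def edit_distance(predicted: str, gold: str) -> int:
    """Levenshtein distance between a predicted and a gold proto-word."""
    prev = list(range(len(gold) + 1))
    for i, p_char in enumerate(predicted, start=1):
        curr = [i]
        for j, g_char in enumerate(gold, start=1):
            substitution = prev[j - 1] + (0 if p_char == g_char else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


# Illustrative cognate set and a made-up model prediction.
source_tokens = encode_cognate_set({"es": "noche", "it": "notte"})
score = edit_distance("nottem", "noctem")  # one substitution: t -> c
```

A sequence model would consume `source_tokens` and emit the characters of the proto-word; a lower `edit_distance` against the attested form then indicates a closer reconstruction.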

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02477/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02477/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1908.02477/full.md

---
Source: https://tomesphere.com/paper/1908.02477