In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology

Chundra A. Cathcart; Florian Wandl

arXiv:2005.13575·cs.CL·May 29, 2020

TL;DR

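This paper compares neural network models that use three types of language embedding to learn Slavic sound changes. The Straight-Through model is the most accurate and learns semi-interpretable information about sound change, while the Sigmoid model's embeddings agree best with the traditional subgrouping of the Slavic languages.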

Contribution

It introduces and compares three types of language embedding (dense, sigmoid, and straight-through) in neural models for historical phonology, and examines what each captures about language subgrouping and sound change.

Findings

01 Straight-Through model achieves highest accuracy

02 Sigmoid embeddings align with traditional Slavic subgroupings

03 Straight-Through embeddings encode semi-interpretable sound change information

Abstract

This paper investigates the ability of neural network architectures to effectively learn diachronic phonological generalizations in a multilingual setting. We employ models using three different types of language embedding (dense, sigmoid, and straight-through). We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages. We find that the Straight-Through model has learned coherent, semi-interpretable information about sound change, and outline directions for future research.
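To make the distinction between the three embedding types concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the 0.5 threshold, and the choice of a deterministic threshold (rather than sampling) for the straight-through variant are all assumptions made for illustration. The dense embedding is an unconstrained real vector, the sigmoid embedding squashes each dimension into (0, 1), and the straight-through embedding emits hard 0/1 values in the forward pass while reusing the sigmoid's gradient in the backward pass.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_embedding(params):
    """Dense (continuous): the real-valued parameters are used unchanged."""
    return list(params)


def sigmoid_embedding(params):
    """Sigmoid: each dimension is squashed into the open interval (0, 1)."""
    return [sigmoid(x) for x in params]


def straight_through_embedding(params, threshold=0.5):
    """Straight-through (discrete), under the assumptions stated above.

    Forward pass: threshold the sigmoid output to a hard 0/1 value.
    Backward pass: thresholding has zero gradient almost everywhere, so the
    estimator substitutes the sigmoid's gradient p * (1 - p) instead.
    Returns (hard_values, surrogate_gradients).
    """
    probs = [sigmoid(x) for x in params]
    hard = [1.0 if p > threshold else 0.0 for p in probs]
    surrogate_grad = [p * (1.0 - p) for p in probs]
    return hard, surrogate_grad


# Hypothetical 3-dimensional parameter vector for one language.
language_params = [-2.0, 0.0, 3.0]
dense = dense_embedding(language_params)
soft = sigmoid_embedding(language_params)
hard, grad = straight_through_embedding(language_params)
```

In this sketch each dimension of the hard vector behaves like a binary feature that a language either has or lacks, which is one way to read the abstract's contrast between continuous and discrete embeddings.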

Code & Models

Repositories

chundrac/slav-dial (TensorFlow, official)
