Persistent Topology of Syntax
Alexander Port, Iulia Gheorghita, Daniel Guth, John M. Clark, Crystal Liang, Shival Dasu, Matilde Marcolli

TL;DR
This paper applies persistent homology to syntactic data of world languages, revealing family-specific topological features that may have historical linguistic significance.
Contribution
It introduces the use of persistent homology to analyze syntactic parameters across language families, uncovering family-specific topological patterns.
Findings
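Different language families exhibit different persistent homology.
The Indo-European family has a persistent first-homology generator that is related to the position of Ancient Greek and the Hellenic branch, not to the Anglo-Norman bridge.
Homology generators behave erratically over the whole data set; non-trivial persistent homology appears only when the data is restricted to specific families.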
Abstract
We study the persistent homology of the data set of syntactic parameters of the world languages. We show that, while homology generators behave erratically over the whole data set, non-trivial persistent homology appears when one restricts to specific language families. Different families exhibit different persistent homology. We focus on the cases of the Indo-European and the Niger-Congo families, for which we compare persistent homology over different cluster filtering values. We investigate the possible significance, in historical linguistic terms, of the presence of persistent generators of the first homology. In particular, we show that the persistent first homology generator we find in the Indo-European family is not due (as one might guess) to the Anglo-Norman bridge in the Indo-European phylogenetic network, but is related to the position of Ancient Greek and the Hellenic branch…
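To make the notion of a persistent first-homology generator concrete, the following is a minimal sketch of Vietoris-Rips persistence over Z/2 on a toy set of binary vectors. It is not the authors' pipeline: the Hamming distance, the function names (`hamming`, `rips_persistence`), and the four "languages" in `toy` are all assumptions made for illustration, and the brute-force enumeration of simplices is only practical for very small inputs.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions where two binary parameter vectors differ (assumed metric)."""
    return sum(a != b for a, b in zip(u, v))

def rips_persistence(points, max_dim=1):
    """Persistence intervals of the Vietoris-Rips filtration, coefficients in Z/2."""
    n = len(points)
    simplices = []
    # Enumerate simplices up to dimension max_dim + 1; a simplex enters the
    # filtration at the largest pairwise distance among its vertices.
    for size in range(1, max_dim + 3):
        for s in combinations(range(n), size):
            f = max((hamming(points[i], points[j])
                     for i, j in combinations(s, 2)), default=0)
            simplices.append((f, size - 1, s))
    simplices.sort()  # order by filtration value, then dimension
    index = {s: k for k, (_, _, s) in enumerate(simplices)}

    pivot_owner = {}  # lowest-one row index -> column that owns it
    columns = {}      # reduced non-zero columns, stored as sets of row indices
    paired = set()
    intervals = {d: [] for d in range(max_dim + 1)}

    # Standard column reduction of the boundary matrix.
    for j, (f_j, dim_j, s) in enumerate(simplices):
        col = set() if dim_j == 0 else {index[face] for face in combinations(s, dim_j)}
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]  # add columns mod 2
        if col:
            i = max(col)
            pivot_owner[i] = j
            columns[j] = col
            paired.update((i, j))
            f_i, dim_i, _ = simplices[i]
            if f_j > f_i:  # drop zero-length intervals
                intervals[dim_i].append((f_i, f_j))

    # Unpaired simplices create classes that never die.
    for k, (f, dim, _) in enumerate(simplices):
        if k not in paired and dim <= max_dim:
            intervals[dim].append((f, float("inf")))
    return {d: sorted(v) for d, v in intervals.items()}

# Hypothetical toy data: four "languages" with two binary parameters each,
# arranged as the corners of a square in Hamming space.
toy = [(0, 0), (0, 1), (1, 1), (1, 0)]
result = rips_persistence(toy)
print(result)
```

In this toy case the four points form a loop once edges of length 1 are added, and the loop is filled in when the diagonals of length 2 appear, so the first homology has a single interval from 1 to 2. A generator is called persistent when such an interval is long relative to the range of the filtration.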
