Multi-Dialectal Representation Learning of Sinitic Phonology
Zhibai Jia

TL;DR
This paper introduces a novel method for creating multi-dialectal Sinitic phonology representations using knowledge graphs and the BoxE technique, enabling phonological comparison, inference, and reconstruction of proto-languages.
Contribution
It presents a new approach combining knowledge graph construction and BoxE for representing Sinitic dialects, facilitating phonological analysis and historical language reconstruction.
Findings
Representations capture phonemic contrasts across dialects.
Classifiers successfully infer unobserved Middle Chinese labels.
Potential applications include knowledge base completion and archaic feature reconstruction.
Abstract
Machine learning techniques have shown their competence in representing and reasoning about symbolic systems such as language and phonology. In Sinitic Historical Phonology, notable tasks that could benefit from machine learning include the comparison of dialects and the reconstruction of proto-language systems. Motivated by this, the paper provides an approach for obtaining multi-dialectal representations of Sinitic syllables by constructing a knowledge graph from structured phonological data and then applying the BoxE technique from knowledge base learning. We applied unsupervised clustering techniques to the obtained representations and observed that they capture phonemic contrasts from the input dialects. Furthermore, we trained classifiers to perform inference of unobserved Middle Chinese labels, showing the representations' potential for indicating archaic, proto-language…
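The knowledge-graph construction step mentioned in the abstract can be pictured with a minimal sketch. The schema here is an assumption made for illustration, not the paper's actual design: each character is an entity, each (dialect, syllable component) pair is a relation, and each component value is an entity. The names `Reading` and `build_triples`, the dialect labels, and the toy readings are all hypothetical and are not drawn from the paper's data.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One character's reading in one dialect (hypothetical record layout)."""
    character: str
    dialect: str
    initial: str
    final: str
    tone: str


def build_triples(readings):
    """Turn readings into (head, relation, tail) triples.

    Assumed schema: head = character, relation = "<dialect>:<component>",
    tail = "<component>:<value>". Tails are shared across dialects here,
    which is a design choice of this sketch, not of the paper.
    """
    triples = set()
    for r in readings:
        for component in ("initial", "final", "tone"):
            relation = f"{r.dialect}:{component}"
            tail = f"{component}:{getattr(r, component)}"
            triples.add((r.character, relation, tail))
    return sorted(triples)


# Toy placeholder values, for illustration only.
TOY_READINGS = [
    Reading("char_A", "dialect_1", "t", "ung", "1"),
    Reading("char_A", "dialect_2", "t", "ong", "1"),
    Reading("char_B", "dialect_1", "th", "ung", "1"),
]

triples = build_triples(TOY_READINGS)
entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
relations = sorted({r for _, r, _ in triples})
```

A triple list of this shape is the usual input format for knowledge base embedding models such as BoxE; whether the paper encodes syllable components this way is not stated in the summary above.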
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Phonetics and Phonology Research
Methods: Balanced Selection
