Tensor models for linguistics pitch curve data of native speakers of Afrikaans
Michael Hornstein, Shuheng Zhou, Kerby Shedden

TL;DR
This paper applies tensor analysis to high-dimensional pitch curve data from native Afrikaans speakers, using a graphical model to reveal relationships between phonetic features and word properties.
Contribution
The paper introduces a tensor modeling framework for demeaned pitch curve data in which the inverse covariance has a Kronecker product structure, with sparse factors corresponding to words and time.
Findings
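Words with long vowels cluster by whether the vowel is pronounced at the front or back of the mouth
Words with short vowels have strong edges associated with the initial consonant
Targeting conditional independence through a graphical model reveals relationships tied to natural properties of words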
Abstract
We use tensor analysis techniques for high-dimensional data to gain insight into pitch curves, which play an important role in linguistics research. In particular, we propose that demeaned phonetics pitch curve data can be modeled as having a Kronecker product inverse covariance structure with sparse factors corresponding to words and time. Using data from a study of native Afrikaans speakers, we show that by targeting conditional independence through a graphical model, we reveal relationships associated with natural properties of words as studied by linguists. We find that words with long vowels cluster based on whether the vowel is pronounced at the front or back of the mouth, and words with short vowels have strong edges associated with the initial consonant.
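To make the Kronecker product inverse covariance structure concrete, the following is a minimal sketch in Python with NumPy. It does not reproduce the paper's estimator or data: the chain-shaped sparsity patterns, the sizes, and the names `tridiagonal_precision`, `word_precision`, and `time_precision` are assumptions chosen purely for illustration. It builds a sparse precision factor for words and one for time, combines them with a Kronecker product, and reads the word-level graph from the nonzero off-diagonal entries of the word factor.

```python
import numpy as np


def tridiagonal_precision(size, off_diag):
    """Hypothetical sparse precision matrix: unit diagonal, chain-graph neighbors."""
    mat = np.eye(size)
    idx = np.arange(size - 1)
    mat[idx, idx + 1] = off_diag
    mat[idx + 1, idx] = off_diag
    return mat


# Illustrative sizes only; the study's real dimensions are not assumed here.
n_words, n_times = 4, 5

# Sparse factors: one over words, one over time points.
word_precision = tridiagonal_precision(n_words, -0.4)
time_precision = tridiagonal_precision(n_times, -0.3)

# Both factors must be positive definite to define a valid Gaussian model.
assert np.all(np.linalg.eigvalsh(word_precision) > 0)
assert np.all(np.linalg.eigvalsh(time_precision) > 0)

# Precision of the row-major flattening of a (words x time) data matrix.
full_precision = np.kron(word_precision, time_precision)

# The inverse of a Kronecker product is the Kronecker product of the inverses,
# so the covariance inherits the same two-factor structure.
full_covariance = np.kron(
    np.linalg.inv(word_precision), np.linalg.inv(time_precision)
)

# Edges of the word graph: pairs of words with a nonzero precision entry.
# A zero entry means the two words are conditionally independent given the rest.
word_edges = [
    (i, j)
    for i in range(n_words)
    for j in range(i + 1, n_words)
    if abs(word_precision[i, j]) > 1e-12
]

print("word graph edges:", word_edges)
```

The practical appeal of this structure is the parameter count: the full precision matrix here is 20 by 20, yet it is determined by a 4 by 4 factor and a 5 by 5 factor, and each factor can be inspected as its own graph.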
Taxonomy
Topics: Phonetics and Phonology Research · Tensor decomposition and applications · Advanced Neuroimaging Techniques and Applications
