On Structured Sparsity of Phonological Posteriors for Linguistic Parsing
Milos Cernak, Afsaneh Asaei, Hervé Bourlard

TL;DR
This paper explores the structured sparsity of phonological posteriors derived from speech signals. It shows that these sparse representations encode supra-segmental linguistic information, such as syllabic and prosodic events, and that this information supports accurate linguistic parsing.
Contribution
It introduces a novel binary sparsity structure of phonological posteriors and shows its effectiveness for supra-segmental linguistic event classification.
Findings
Phonological posteriors are sparse and encode supra-segmental information.
Binary pattern matching of sparsity structures achieves high parsing accuracy.
High-order structures improve linguistic event classification.
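
The binary pattern matching idea in the findings above can be illustrated with a minimal sketch. This is not the authors' implementation: the 0.5 threshold, the nearest-match rule based on Hamming distance, the four-class toy posteriors, and the "stressed"/"unstressed" labels are all assumptions chosen only to make the example concrete.

```python
from collections import Counter, defaultdict


def binarize(posterior, threshold=0.5):
    """Map a posterior vector to its binary sparsity structure (tuple of 0/1).

    The threshold value is an assumption made for this sketch.
    """
    return tuple(1 if p >= threshold else 0 for p in posterior)


def build_codebooks(posteriors, labels, threshold=0.5):
    """Count the binary structures observed for each linguistic label."""
    codebooks = defaultdict(Counter)
    for posterior, label in zip(posteriors, labels):
        codebooks[label][binarize(posterior, threshold)] += 1
    return codebooks


def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(posterior, codebooks, threshold=0.5):
    """Return the label whose codebook holds the closest binary structure.

    Nearest match by Hamming distance is an assumed matching rule.
    Ties are resolved in favor of the alphabetically first label.
    """
    code = binarize(posterior, threshold)
    best_label, best_dist = None, None
    for label in sorted(codebooks):
        dist = min(hamming(code, c) for c in codebooks[label])
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data (hypothetical): each row is a posterior over four phonological classes.
train = [
    [0.90, 0.10, 0.80, 0.05],
    [0.85, 0.20, 0.70, 0.10],
    [0.10, 0.90, 0.05, 0.60],
    [0.20, 0.80, 0.10, 0.70],
]
labels = ["stressed", "stressed", "unstressed", "unstressed"]

codebooks = build_codebooks(train, labels)
prediction = classify([0.95, 0.30, 0.60, 0.20], codebooks)
print(prediction)
```

In this toy setting, a new frame is assigned to whichever label has previously produced the most similar on/off pattern of phonological classes. In practice, the posteriors would come from a trained DNN rather than hand-written values.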
Abstract
The speech signal conveys information on different time scales, from the short (segmental) time scale, associated with phonological and phonetic information, to the long (supra-segmental) time scale, associated with syllabic and prosodic information. Linguistic and neurocognitive studies recognize the phonological classes at the segmental level as the essential and invariant representations used in speech temporal organization. In the context of speech processing, a deep neural network (DNN) is an effective computational method to infer the probability of individual phonological classes from a short segment of the speech signal. A vector of all phonological class probabilities is referred to as a phonological posterior. Only a few classes are active in any short-term speech signal; hence, the phonological posterior is a sparse vector. Although the phonological posteriors are estimated at segmental…
