Computational Induction of Prosodic Structure
Dafydd Gibbon

TL;DR
This paper introduces an inductive, language-independent method for analyzing prosodic structure directly from speech signals, advancing the understanding of rhythm in speech across languages.
Contribution
It proposes the Rhythm Formant Theory and Analysis method, filling a gap by grounding prosodic analysis in physical signal data rather than language-specific models.
Findings
Differences in rhythm patterns between Mandarin and English were identified.
The method demonstrated validity through application to spoken Mandarin.
Language-internal factors influence prosodic rhythm more than differences between languages do.
Abstract
The present study has two goals relating to the grammar of prosody, understood as the rhythms and melodies of speech. First, an overview is provided of the computable grammatical and phonetic approaches to prosody analysis which use hypothetico-deductive methods and are based on learned hermeneutic intuitions about language. Second, a proposal is presented for an inductive grounding in the physical signal, in which prosodic structure is inferred using a language-independent method from the low-frequency spectrum of the speech signal. The overview includes a discussion of computational aspects of standard generative and post-generative models, and suggestions for reformulating these to form inductive approaches. Also included is a discussion of linguistic phonetic approaches to analysis of annotations (pairs of speech unit labels with time-stamps) of recorded spoken utterances. The…
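The inductive step described in the abstract, inferring rhythm from the low-frequency spectrum of the speech signal, can be illustrated with a minimal sketch. This is not the author's Rhythm Formant Analysis implementation. It only assumes that a crude amplitude envelope (here, the rectified signal) is Fourier-transformed and inspected below about 10 Hz. The function names, the 10 Hz cutoff, and the synthetic 4 Hz test signal are all hypothetical choices made for illustration.

```python
import numpy as np


def low_frequency_envelope_spectrum(signal, sample_rate, max_hz=10.0):
    """Return (frequencies, magnitudes) of the envelope spectrum up to max_hz.

    Assumption: full-wave rectification stands in for a proper envelope
    extraction; the 10 Hz cutoff is an illustrative choice, not a value
    taken from the paper.
    """
    envelope = np.abs(signal)              # crude amplitude envelope
    envelope = envelope - envelope.mean()  # remove DC so 0 Hz does not dominate
    magnitudes = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    keep = (freqs > 0) & (freqs <= max_hz)  # keep only the low-frequency band
    return freqs[keep], magnitudes[keep]


def dominant_rhythm_frequency(signal, sample_rate, max_hz=10.0):
    """Return the frequency (Hz) of the largest low-frequency spectral peak."""
    freqs, mags = low_frequency_envelope_spectrum(signal, sample_rate, max_hz)
    return float(freqs[np.argmax(mags)])


# Synthetic stand-in for speech: a 200 Hz carrier whose amplitude rises and
# falls 4 times per second, roughly a syllable-like rate (hypothetical data).
sample_rate = 8000
duration = 5.0
t = np.arange(int(sample_rate * duration)) / sample_rate
carrier = np.sin(2 * np.pi * 200.0 * t)
modulator = 0.5 * (1.0 + np.cos(2 * np.pi * 4.0 * t))
signal = modulator * carrier

peak_hz = dominant_rhythm_frequency(signal, sample_rate)
print(f"Dominant low-frequency peak: {peak_hz:.2f} Hz")
```

For this synthetic signal the strongest peak should lie at the 4 Hz modulation rate. With recorded speech the low-frequency spectrum would typically show several broader peaks, and interpreting them is where the paper's theory comes in; the sketch covers only the signal-processing step.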
Taxonomy
Topics: Phonetics and Phonology Research · Speech Recognition and Synthesis · Language and cultural evolution
