
TL;DR
This paper explores the concept of multiple time domains in prosody processing, discussing various analysis methods including amplitude envelope modulation spectrum and oscillator models, to advance understanding of speech rhythm and prosodic features.
Contribution
It frames prosody in terms of multiple time domains rather than the usual domains of pitch, intensity and duration, and discusses time domain analysis techniques for speech rhythm, including the Amplitude Envelope Modulation Spectrum (AEMS) and oscillator models.
Findings
Amplitude envelope modulation spectrum reveals long-term prosodic patterns.
Oscillator models provide a unified framework for prosodic rhythm analysis.
Time domain analysis offers new insights into speech prosody.
Abstract
Prosody is usually defined in terms of the three distinct but interacting domains of pitch, intensity and duration patterning, or, more generally, as phonological and phonetic properties of 'suprasegmentals', speech segments which are larger than consonants and vowels. Rather than taking this approach, the concept of multiple time domains for prosody processing is taken up, and methods of time domain analysis are discussed: annotation mining with timing dispersion measures, time tree induction, oscillator models in phonology and phonetics, and finally the use of the Amplitude Envelope Modulation Spectrum (AEMS). While frequency demodulation (in the form of pitch tracking) is a central issue in prosodic analysis, in the present context the focus is on amplitude envelope demodulation and on frequency zones in the long time-domain spectra of the demodulated envelope. A generalised…
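The idea behind the AEMS can be illustrated with a small sketch: demodulate the amplitude envelope of a signal, then look at the low-frequency spectrum of that envelope. The code below is an illustration under stated assumptions, not the paper's procedure. Full-wave rectification followed by a moving average stands in for whatever envelope demodulation the paper uses, the function names `amplitude_envelope` and `modulation_spectrum` are hypothetical, and the input is a synthetic tone (a 200 Hz carrier modulated at 4 Hz) rather than speech.

```python
import cmath
import math


def amplitude_envelope(signal, sample_rate, window_s=0.05):
    """Rectify the signal and smooth it with a centred moving average.

    Assumption: this simple rectify-and-smooth step stands in for
    envelope demodulation; it is not taken from the paper.
    """
    half = max(1, int(window_s * sample_rate) // 2)
    rectified = [abs(x) for x in signal]
    envelope = []
    for i in range(len(rectified)):
        lo, hi = max(0, i - half), min(len(rectified), i + half + 1)
        envelope.append(sum(rectified[lo:hi]) / (hi - lo))
    return envelope


def modulation_spectrum(envelope, envelope_rate, max_freq=20.0):
    """Return (frequency in Hz, magnitude) pairs for the envelope.

    Uses a direct DFT over the low-frequency bins only, after removing
    the mean so that the constant offset does not dominate.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centred = [e - mean for e in envelope]
    spectrum = []
    k = 1
    while k * envelope_rate / n <= max_freq and k < n // 2:
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centred)
        )
        spectrum.append((k * envelope_rate / n, 2 * abs(coeff) / n))
        k += 1
    return spectrum


# Synthetic test signal (hypothetical values): 200 Hz carrier, 4 Hz modulation.
sample_rate = 1000
n_samples = 2 * sample_rate  # two seconds
signal = [
    (1 + 0.8 * math.sin(2 * math.pi * 4 * t / sample_rate))
    * math.sin(2 * math.pi * 200 * t / sample_rate)
    for t in range(n_samples)
]

envelope = amplitude_envelope(signal, sample_rate)

# The envelope varies slowly, so it can be downsampled before the DFT.
step = 10
spectrum = modulation_spectrum(envelope[::step], sample_rate / step)

peak_freq, peak_mag = max(spectrum, key=lambda pair: pair[1])
print(f"strongest modulation frequency: {peak_freq:.1f} Hz")
```

On this synthetic input the strongest component of the envelope spectrum falls at the 4 Hz modulation rate, not at the 200 Hz carrier. For real speech the envelope spectrum would be much less clean, and the choice of smoothing window and frequency range would need care.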
