Time-Varying Autoregressions in Speech: Detection Theory and Applications
Daniel Rudoy, Thomas F. Quatieri, and Patrick J. Wolfe

TL;DR
This paper develops a detection theory for speech analysis based on time-varying autoregressive models, yielding a computationally efficient test for vocal tract variation on time scales from tens of milliseconds down to below ten milliseconds.
Contribution
It develops a general detection framework based on time-varying autoregressive models, extending classical speech analysis methods with a decision-theoretic approach and practical detection procedures.
Findings
Formant changes are detected from data records on the scale of tens of milliseconds.
Glottal opening and closing instants are identified on time scales below ten milliseconds.
The proposed procedure shows practical efficacy on real speech data.
Abstract
This article develops a general detection theory for speech analysis based on time-varying autoregressive models, which themselves generalize the classical linear predictive speech analysis framework. This theory leads to a computationally efficient decision-theoretic procedure that may be applied to detect the presence of vocal tract variation in speech waveform data. A corresponding generalized likelihood ratio test is derived and studied both empirically for short data records, using formant-like synthetic examples, and asymptotically, leading to constant false alarm rate hypothesis tests for changes in vocal tract configuration. Two in-depth case studies then serve to illustrate the practical efficacy of this procedure across different time scales of speech dynamics: first, the detection of formant changes on the scale of tens of milliseconds of data, and second, the identification…
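The generalized likelihood ratio test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on assumptions that this summary does not state: Gaussian innovations, AR coefficients expanded in a polynomial time basis of degree q, least-squares fitting, and a chi-square reference distribution with p*q degrees of freedom under the stationary hypothesis (Wilks' theorem). The function names tvar_glrt and simulate_ar2 and all parameter values are hypothetical.

```python
import numpy as np


def tvar_glrt(x, p=2, q=2):
    """GLRT statistic for H0: AR(p) versus H1: TVAR(p).

    Assumption: under H1 each AR coefficient is a polynomial of degree q
    in normalized time. Returns (statistic, degrees_of_freedom).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(p, N)                      # time indices with p past samples
    y = x[n]                                 # samples to be predicted
    # Column i holds the lagged signal x[n - (i + 1)].
    lags = np.column_stack([x[n - i - 1] for i in range(p)])
    # Polynomial basis on t in [-1, 1]; the j = 0 term is the constant 1.
    t = 2.0 * n / (N - 1) - 1.0
    basis = np.column_stack([t ** j for j in range(q + 1)])
    # H1 regressors: every lag multiplied by every basis function.
    design_h1 = np.column_stack([lags * basis[:, [j]] for j in range(q + 1)])
    design_h0 = lags                         # constant coefficients only

    def rss(design):
        coef = np.linalg.lstsq(design, y, rcond=None)[0]
        resid = y - design @ coef
        return resid @ resid

    # For Gaussian innovations, twice the log-likelihood ratio reduces to
    # the log ratio of residual sums of squares scaled by the sample count.
    stat = len(y) * np.log(rss(design_h0) / rss(design_h1))
    return stat, p * q


def simulate_ar2(thetas, r=0.95, seed=0):
    """Toy formant-like AR(2) signal with pole radius r and pole angle thetas[n]."""
    rng = np.random.default_rng(seed)
    N = len(thetas)
    w = rng.standard_normal(N)
    x = np.zeros(N)
    for k in range(2, N):
        a1 = 2.0 * r * np.cos(thetas[k])     # coefficient set by the pole angle
        x[k] = a1 * x[k - 1] - r ** 2 * x[k - 2] + w[k]
    return x


if __name__ == "__main__":
    N = 400
    fixed = simulate_ar2(np.full(N, 0.2 * np.pi))
    moving = simulate_ar2(np.linspace(0.1 * np.pi, 0.4 * np.pi, N))
    print("stationary :", tvar_glrt(fixed))
    print("time-varying:", tvar_glrt(moving))
```

Under these assumptions, comparing the statistic with a chi-square quantile (for example scipy.stats.chi2.ppf(0.99, dof)) fixes the approximate false alarm rate regardless of the unknown stationary AR coefficients, which is the sense in which such a test has a constant false alarm rate. The approximation is asymptotic, so thresholds for short data records may need empirical calibration.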
