TL;DR
This paper introduces a novel single-stage method called TVQCP for more accurate formant tracking in speech signals by combining quasi-closed-phase analysis, sparsity optimization, and long-term continuity constraints.
Contribution
The proposed TVQCP method unifies formant estimation and tracking into one process, improving accuracy over traditional two-stage approaches.
Findings
TVQCP outperforms Wavesurfer and Praat in accuracy.
TVQCP surpasses KARMA and DeepFormants in formant tracking.
The method is robust on both synthetic and natural speech signals.
Abstract
In this paper, we propose a new method for the accurate estimation and tracking of formants in speech signals using time-varying quasi-closed-phase (TVQCP) analysis. Conventional formant tracking methods typically adopt a two-stage estimate-and-track strategy wherein an initial set of formant candidates is estimated using short-time analysis (e.g., 10–50 ms), followed by a tracking stage based on dynamic programming or a linear state-space model. One of the main disadvantages of these approaches is that the tracking stage, however good it may be, cannot improve upon the formant estimation accuracy of the first stage. The proposed TVQCP method provides single-stage formant tracking that combines the estimation and tracking stages into one. TVQCP analysis combines three approaches to improve formant estimation and tracking: (1) it uses temporally weighted quasi-closed-phase analysis…
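To make the "estimate" stage of the conventional two-stage strategy concrete, the following is a minimal sketch of short-time formant candidate estimation on a single frame. It assumes plain autocorrelation-method linear prediction as the short-time analysis, which the abstract does not specify, and it is not the TVQCP method. The function names, the default model order, and the frequency and bandwidth thresholds are illustrative assumptions.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Linear-prediction polynomial A(z) for one frame (autocorrelation method).

    Assumption: conventional LP analysis, not the paper's TVQCP analysis.
    Returns [1, -a_1, ..., -a_p], where x[n] is predicted as sum_k a_k x[n-k].
    """
    w = frame * np.hamming(len(frame))
    full = np.correlate(w, w, mode="full")
    r = full[len(w) - 1 : len(w) + order]  # autocorrelation at lags 0..order
    # Normal equations R a = r[1:], with R a symmetric Toeplitz matrix.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -a))


def formant_candidates(frame, fs, order=10, min_freq=90.0, max_bw=400.0):
    """Formant candidates (Hz) from the roots of the LP polynomial.

    min_freq and max_bw are hypothetical pruning thresholds for illustration.
    """
    roots = np.roots(lpc_autocorr(frame, order))
    roots = roots[np.imag(roots) > 0]  # keep one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > min_freq) & (bws < max_bw)
    return np.sort(freqs[keep])
```

In a two-stage tracker, per-frame candidates like these would be handed to a separate tracking stage. Any estimation error made at this point is carried forward, which is the limitation the single-stage formulation is meant to address.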
