VOP Detection for Read and Conversation Speech using CWT Coefficients and Phone Boundaries
Kumud Tripathi, K. Sreenivasa Rao

TL;DR
This paper introduces a two-stage method combining CWT coefficients and phone boundary detection to accurately identify vowel onset points in speech, improving precision over existing methods for speech segmentation tasks.
Contribution
A novel two-stage approach that integrates continuous wavelet transform and phone boundary detection for more accurate VOP detection in speech signals.
Findings
Outperforms existing VOP detection methods in accuracy.
Effective on both TIMIT and Bengali speech corpora.
Targets VOP detection deviations tighter than the 40 ms tolerance within which existing methods detect most VOPs.
Abstract
In this paper, we propose a novel approach for accurate detection of vowel onset points (VOPs). A VOP is the instant at which a vowel begins in the speech signal. Precise identification of VOPs is important for speech applications such as speech segmentation and speech rate modification. Existing methods detect the majority of VOPs within a 40 ms deviation, which may not be precise enough for these applications. To address this issue, we propose a two-stage approach. In the first stage, VOPs are detected using continuous wavelet transform (CWT) coefficients; in the second stage, the positions of the detected VOPs are corrected using phone boundaries, which are detected by the spectral transition measure method. Experiments are conducted on the TIMIT and Bengali speech corpora. Performance of the proposed approach is…
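To make the second stage concrete, here is a minimal sketch of one way a candidate VOP could be corrected using phone boundaries. The abstract does not state the paper's actual correction rule, so the nearest-boundary snapping below is an assumption, as are the function name `snap_vops_to_boundaries` and the `max_shift_ms` parameter. Its 40 ms default is only illustrative, borrowed from the deviation figure quoted for existing methods. The sketch takes stage-one VOP candidates and phone boundary times as given and does not implement the CWT or spectral transition measure steps.

```python
from bisect import bisect_left


def snap_vops_to_boundaries(vops_ms, boundaries_ms, max_shift_ms=40.0):
    """Move each candidate VOP to the nearest phone boundary within max_shift_ms.

    Hypothetical correction rule: the paper's exact rule is not given
    in the abstract, so nearest-boundary snapping is assumed here.
    """
    boundaries = sorted(boundaries_ms)
    corrected = []
    for vop in vops_ms:
        i = bisect_left(boundaries, vop)
        # The nearest boundary is either just before or just after position i.
        neighbours = boundaries[max(i - 1, 0): i + 1]
        nearest = min(neighbours, key=lambda b: abs(b - vop), default=None)
        if nearest is not None and abs(nearest - vop) <= max_shift_ms:
            corrected.append(nearest)  # boundary is close enough: snap to it
        else:
            corrected.append(vop)      # no boundary nearby: keep the candidate
    return corrected
```

For example, with boundaries at 100, 250 and 400 ms, candidates at 118 and 240 ms would move to 100 and 250 ms, while a candidate at 320 ms would stay put because no boundary lies within 40 ms of it.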
