TL;DR
This paper introduces a novel method for measuring sonority in speech by analyzing vocal-tract, source, and suprasegmental features, improving phoneme recognition and sonorant classification accuracy.
Contribution
It presents a new multi-faceted sonority measurement approach combining formant, excitation, and periodicity features, outperforming traditional MFCC-based methods.
Findings
Enhanced discrimination among sonorant classes.
Improved phoneme recognition accuracy.
Combining vocal-tract, excitation-source, and suprasegmental features is effective.
Abstract
Sonorant sounds are characterized by regions with prominent formant structure, high energy, and a high degree of periodicity. In this work, the vocal-tract system, excitation source, and suprasegmental features derived from the speech signal are analyzed to measure the sonority information present in each of them. Vocal-tract system information is extracted from the Hilbert envelope of the numerator of the group delay function. This function is derived from the zero-time-windowed speech signal, which provides better resolution of the formants. A five-dimensional feature set is computed from the estimated formants to measure the prominence of the spectral peaks. A feature representing the strength of excitation is derived from the Hilbert envelope of the linear prediction residual, which represents the source information. The correlation of speech over ten consecutive pitch periods is used as the suprasegmental feature…
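As an illustration of the suprasegmental feature, the sketch below scores periodicity by correlating adjacent pitch-period segments over ten consecutive periods. The abstract does not give the exact formula, so this is only one plausible reading: the helper names `normalized_correlation` and `periodicity_score`, the use of zero-lag normalized correlation, the averaging over adjacent pairs, and the assumption of a known, constant pitch period in samples are all assumptions made here, not the authors' method.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized correlation of two equal-length segments."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def periodicity_score(signal, start, period, n_periods=10):
    """Mean correlation between adjacent pitch-period segments.

    Assumption (not from the paper): `period` is the pitch period in
    samples, already estimated and constant over the analysis span.
    """
    end = start + n_periods * period
    if start < 0 or period <= 0 or end > len(signal):
        raise ValueError("signal too short for the requested span")
    # Cut the span into n_periods back-to-back segments of one period each.
    segments = [
        signal[start + k * period : start + (k + 1) * period]
        for k in range(n_periods)
    ]
    # Correlate each segment with the next one and average the results.
    scores = [
        normalized_correlation(segments[k], segments[k + 1])
        for k in range(n_periods - 1)
    ]
    return sum(scores) / len(scores)
```

Under these assumptions, a strongly periodic (sonorant-like) segment scores close to 1, while an aperiodic, noise-like segment scores close to 0.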
