Significance of the levels of spectral valleys with application to front/back distinction of vowel sounds
T. V. Ananthapadmanabha, A. G. Ramakrishnan, and Shubham Sharma

TL;DR
This paper introduces a spectral valley level-based feature for front/back vowel classification, achieving high accuracy and robustness across multiple databases and noise conditions without explicit formant knowledge.
Contribution
It proposes a novel spectral valley level feature that simplifies vowel classification and demonstrates its effectiveness and robustness compared to traditional methods.
Findings
Achieves about 95% accuracy in front/back vowel classification.
The spectral valley feature is robust to noise and does not require explicit formant knowledge.
Comparable performance to neural network classifiers using MFCC features.
Abstract
An objective critical distance (OCD) is defined as the spacing between adjacent formants at which the level of the valley between them reaches the mean spectral level. The measured OCD lies in the same range (viz., 3-3.5 bark) as the critical distance determined by subjective experiments under similar experimental conditions. The level of the spectral valley serves a purpose similar to that of the spacing between the formants, with the added advantage that it can be measured from the spectral envelope without explicit knowledge of the formant frequencies. Based on the relative spacing of formant frequencies, the level of the spectral valley VI (between F1 and F2) is much higher than the level of VII (the spectral valley between F2 and F3) for back vowels, and vice versa for front vowels. Classification of vowels into front/back distinction with the difference (VI-VII) as an acoustic feature,…
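The valley-level comparison described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the simple local-maximum peak picking, the 0 dB decision threshold, and the toy Gaussian-bump envelope that stands in for a real spectral envelope. The sketch treats the first three envelope peaks as the regions bounding the two valleys, takes the lowest envelope level between successive peaks as VI and VII, and labels a frame "back" when VI-VII is positive.

```python
import numpy as np


def find_peaks_simple(env):
    """Indices of interior local maxima (hypothetical helper, not from the paper)."""
    i = np.arange(1, len(env) - 1)
    mask = (env[i] > env[i - 1]) & (env[i] >= env[i + 1])
    return i[mask]


def valley_level_difference(envelope_db):
    """Return VI - VII in dB for a spectral envelope given in dB.

    VI is the lowest level between the first and second envelope peaks,
    VII the lowest level between the second and third peaks.
    """
    env = np.asarray(envelope_db, dtype=float)
    peaks = find_peaks_simple(env)
    if len(peaks) < 3:
        raise ValueError("need at least three spectral peaks to locate two valleys")
    p1, p2, p3 = peaks[:3]
    v1 = env[p1:p2 + 1].min()  # valley between first and second peak
    v2 = env[p2:p3 + 1].min()  # valley between second and third peak
    return float(v1 - v2)


def classify_front_back(envelope_db, threshold_db=0.0):
    """Label a frame 'back' if VI - VII exceeds the (assumed) threshold."""
    return "back" if valley_level_difference(envelope_db) > threshold_db else "front"


def toy_envelope_db(formants_hz, freqs_hz, width_hz=200.0, peak_db=30.0):
    """Toy envelope: one Gaussian bump per resonance (illustration only)."""
    f = np.asarray(freqs_hz, dtype=float)[:, None]
    centres = np.asarray(formants_hz, dtype=float)[None, :]
    return peak_db * np.exp(-0.5 * ((f - centres) / width_hz) ** 2).sum(axis=1)


freqs = np.arange(0, 4001, 10)
back_env = toy_envelope_db([400, 900, 2500], freqs)    # F1 and F2 close together
front_env = toy_envelope_db([300, 2200, 2900], freqs)  # F2 and F3 close together
print(classify_front_back(back_env), classify_front_back(front_env))
```

Referencing each valley level to the mean spectral level would add the same constant to both terms, so it cancels in the difference and is omitted here. On real speech the envelope peaks need not coincide with formants, so a practical implementation would need a more careful envelope estimate and valley search than this sketch provides.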
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
