Multilingual analysis of intelligibility classification using English, Korean, and Tamil dysarthric speech datasets
Eun Jung Yeo, Sunhee Kim, Minhwa Chung

TL;DR
This study compares acoustic features across English, Korean, and Tamil dysarthric speech to identify language-independent and language-dependent markers for intelligibility classification, revealing different key features for each language.
Contribution
It introduces a multilingual analysis of dysarthric speech, highlighting language-specific and universal acoustic features for intelligibility assessment.
Findings
Pronunciation features are language-independent markers.
Voice quality and prosody features vary by language.
Different speech dimensions influence intelligibility classification per language.
Abstract
This paper analyzes dysarthric speech datasets from three languages with different prosodic systems: English, Korean, and Tamil. We inspect 39 acoustic measurements that reflect three speech dimensions: voice quality, pronunciation, and prosody. For the multilingual analysis, we examine the mean values of the acoustic measurements by intelligibility level. Further, automatic intelligibility classification is performed to identify the optimal feature set for each language. The analyses suggest that pronunciation features, such as Percentage of Correct Consonants, Percentage of Correct Vowels, and Percentage of Correct Phonemes, are language-independent measurements. Voice quality and prosody features, however, generally behave differently across languages. Experimental results additionally show that different speech dimensions play a greater role for different languages:…
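To make the pronunciation features named in the abstract concrete, here is a minimal Python sketch of how Percentage of Correct Consonants (PCC), Percentage of Correct Vowels (PCV), and Percentage of Correct Phonemes (PCP) could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the target and produced phoneme sequences are assumed to be already aligned position by position, and the `VOWELS` set and the `percent_correct` function are hypothetical names introduced here.

```python
from typing import Dict, Sequence, Set

# Hypothetical vowel inventory (ARPAbet-like symbols), for illustration only.
# A real system would use the phoneme inventory of the language under study.
VOWELS: Set[str] = {"AA", "AE", "AH", "EH", "IH", "IY", "OW", "UW"}


def percent_correct(
    target: Sequence[str],
    produced: Sequence[str],
    vowels: Set[str] = VOWELS,
) -> Dict[str, float]:
    """Return PCC, PCV, and PCP (in percent) for two pre-aligned sequences.

    Assumption: target[i] and produced[i] refer to the same phoneme slot.
    Handling insertions and deletions would require an alignment step first.
    """
    if len(target) != len(produced):
        raise ValueError("sequences must be aligned to the same length")

    correct = {"consonant": 0, "vowel": 0}
    total = {"consonant": 0, "vowel": 0}
    for t, p in zip(target, produced):
        # Classify each slot by the target phoneme, not the produced one.
        kind = "vowel" if t in vowels else "consonant"
        total[kind] += 1
        if t == p:
            correct[kind] += 1

    def pct(num: int, den: int) -> float:
        # NaN when the class never occurs in the target.
        return 100.0 * num / den if den else float("nan")

    return {
        "PCC": pct(correct["consonant"], total["consonant"]),
        "PCV": pct(correct["vowel"], total["vowel"]),
        "PCP": pct(sum(correct.values()), sum(total.values())),
    }


# Example: target "cat" /K AE T/ produced with a substituted vowel /K AH T/.
scores = percent_correct(["K", "AE", "T"], ["K", "AH", "T"])
```

In this example both consonants match and the single vowel does not, so PCC is 100, PCV is 0, and PCP is two out of three phonemes. Because all three measures are ratios over a language's own phoneme inventory, they can be compared across languages more readily than raw acoustic values, which fits the abstract's description of them as language-independent.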
Taxonomy
Topics: Voice and Speech Disorders · Phonetics and Phonology Research · Speech Recognition and Synthesis
