Robust Cross-Etiology and Speaker-Independent Dysarthric Speech Recognition
Satwinder Singh, Qianli Wang, Zihan Zhong, Clarion Mendes, Mark Hasegawa-Johnson, Waleed Abdulla, Seyed Reza Shahamiri

TL;DR
This paper introduces a robust speaker-independent dysarthric speech recognition system that generalizes across different speakers and etiologies, achieving low error rates on multiple datasets using the Whisper model.
Contribution
Developed a novel speaker-independent dysarthric speech recognition system that performs well across different speakers and etiologies, leveraging the Whisper model for improved robustness.
Findings
Achieved a character error rate (CER) of 6.99% and a word error rate (WER) of 10.71% on the SAP-1005 dataset.
Achieved a CER of 25.08% and a WER of 39.56% on the TORGO dataset.
Demonstrated cross-etiology generalization across Parkinson's disease (PD), cerebral palsy (CP), and amyotrophic lateral sclerosis (ALS).
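For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The function names are hypothetical, and the sketch assumes whitespace tokenization with no text normalization; the paper's own scoring and normalization steps are not described in this summary and may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only, not taken from SAP-1005 or TORGO.
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {wer(ref, hyp):.2%}")
    print(f"CER: {cer(ref, hyp):.2%}")
```

Because both rates are normalized by the reference length, a hypothesis with many insertions can score above 100%. CER is usually lower than WER for the same output, since a single wrong character makes the whole word count as an error.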
Abstract
In this paper, we present a speaker-independent dysarthric speech recognition system, with a focus on evaluating the recently released Speech Accessibility Project (SAP-1005) dataset, which includes speech data from individuals with Parkinson's disease (PD). Despite the growing body of research in dysarthric speech recognition, many existing systems are speaker-dependent and adaptive, limiting their generalizability across different speakers and etiologies. Our primary objective is to develop a robust speaker-independent model capable of accurately recognizing dysarthric speech, irrespective of the speaker. Additionally, as a secondary objective, we aim to test the cross-etiology performance of our model by evaluating it on the TORGO dataset, which contains speech samples from individuals with cerebral palsy (CP) and amyotrophic lateral sclerosis (ALS). By leveraging the Whisper model,…
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Phonetics and Phonology Research
