Towards Temporally Explainable Dysarthric Speech Clarity Assessment

Seohyun Park; Chitralekha Gupta; Michelle Kah Yian Kwan; Xinhui Fung; Alexander Wenjun Yip; Suranga Nanayakkara

arXiv:2506.00454·eess.AS·June 3, 2025

TL;DR

This paper proposes a three-stage, explainable framework for assessing dysarthric speech clarity, including scoring, localization, and classification, using pretrained ASR models to aid speech therapy.

Contribution

It introduces a novel, temporally explainable assessment framework for dysarthric speech that leverages pretrained ASR models and provides actionable feedback.

Findings

01. Pretrained ASR models can effectively evaluate dysarthric speech clarity.

02. The framework offers clinically relevant insights for speech therapy.

03. Automated mispronunciation localization and classification are feasible.

Abstract

Dysarthria, a motor speech disorder, affects intelligibility and requires targeted interventions for effective communication. In this work, we investigate automated mispronunciation feedback by collecting a dysarthric speech dataset from six speakers reading two passages, annotated by a speech therapist with temporal markers and mispronunciation descriptions. We design a three-stage framework for explainable mispronunciation evaluation: (1) overall clarity scoring, (2) mispronunciation localization, and (3) mispronunciation type classification. We systematically analyze pretrained Automatic Speech Recognition (ASR) models in each stage, assessing their effectiveness in dysarthric speech evaluation (Code available at: https://github.com/augmented-human-lab/interspeech25_speechtherapy, Supplementary webpage: https://apps.ahlab.org/interspeech25_speechtherapy/). Our findings offer…
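
To make the three stages in the abstract concrete, here is a minimal sketch of how scoring, localization, and type classification could be chained once an ASR transcript is available. It is not the authors' method: it assumes the reference passage and the transcript are plain strings, uses a word-level alignment from Python's `difflib` as a stand-in for whatever the paper's models do, localizes errors by word index rather than by timestamp, and uses the hypothetical labels `substitution`, `deletion`, and `insertion` in place of the therapist's mispronunciation descriptions. The function name `assess` and the example sentences are also illustrative.

```python
from difflib import SequenceMatcher

# Hypothetical error labels derived from difflib opcode tags (illustrative only;
# the paper's mispronunciation categories are not reproduced here).
ERROR_TYPES = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}


def assess(reference: str, transcript: str):
    """Return (clarity score in [0, 1], list of localized, typed errors).

    Stage 1: clarity is 1 minus an approximate word error rate.
    Stage 2: each error is localized as a word-index span in the reference.
    Stage 3: each error gets a coarse type from the alignment operation.
    """
    ref = reference.lower().split()
    hyp = transcript.lower().split()
    if not ref:
        raise ValueError("reference passage must contain at least one word")

    errors = []
    n_errors = 0
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Count the longer side of the mismatched region as the error size.
        n_errors += max(i2 - i1, j2 - j1)
        errors.append(
            {
                "type": ERROR_TYPES[tag],
                "ref_span": (i1, i2),       # word indices in the reference
                "ref_words": ref[i1:i2],    # what should have been said
                "hyp_words": hyp[j1:j2],    # what the ASR model heard
            }
        )

    clarity = max(0.0, 1.0 - n_errors / len(ref))
    return clarity, errors


# Example with made-up sentences: one misrecognized word, one dropped word.
score, errors = assess("the quick brown fox jumps", "the quick bown fox")
print(score)
for err in errors:
    print(err)
```

The alignment from `difflib` is not guaranteed to be a minimum edit distance, so the score is only a rough proxy for word error rate. A temporally explainable system of the kind the abstract describes would also need to map each span to start and end times in the audio, which this sketch leaves out.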

Code & Models

Repositories

augmented-human-lab/interspeech25_speechtherapy (PyTorch, official)

Taxonomy

Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Phonetics and Phonology Research