Towards a Single ASR Model That Generalizes to Disordered Speech

Jimmy Tobin; Katrin Tomanek; Subhashini Venugopalan

arXiv:2412.19315·eess.AS·December 22, 2025·ICASSP

Open Access

TL;DR

Integrating a small amount of disordered speech data into a state-of-the-art ASR model significantly improves recognition accuracy for disordered speech without harming standard benchmarks, advancing accessible speech technology.

Contribution

This work demonstrates that adding a small dataset of disordered speech to fine-tuning can substantially improve ASR performance on disordered speech, closing 64% of the gap between the baseline system and personalized models.

Findings

01. 33% improvement on prompted disordered speech

02. 26% improvement on spontaneous disordered speech

03. No significant decline on standard benchmarks

Abstract

This study investigates the impact of integrating a dataset of disordered speech recordings ($\sim$1,000 hours) into the fine-tuning of a near state-of-the-art ASR baseline system. Contrary to what one might expect, despite the data being less than 1% of the training data of the ASR system, we find a considerable improvement in disordered speech recognition accuracy. Specifically, we observe a 33% improvement on prompted speech, and a 26% improvement on a newly gathered spontaneous, conversational dataset of disordered speech. Importantly, there is no significant performance decline on standard speech recognition benchmarks. Further, we observe that the proposed tuning strategy helps close the gap between the baseline system and personalized models by 64%, highlighting the significant progress as well as the room for improvement. Given the substantial benefits of our findings, this…
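The abstract reports percentage improvements and a percentage of the gap closed, but this excerpt does not spell out how those figures are computed. The sketch below shows one common way to read such numbers, assuming the metric is word error rate (WER) and that "improvement" means relative WER reduction. The function names and the WER values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_wer: float, tuned_wer: float) -> float:
    """Relative WER reduction of the tuned model over the baseline, as a fraction."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - tuned_wer) / baseline_wer


def gap_closed(baseline_wer: float, tuned_wer: float, personalized_wer: float) -> float:
    """Fraction of the baseline-to-personalized WER gap removed by tuning."""
    gap = baseline_wer - personalized_wer
    if gap <= 0:
        raise ValueError("personalized_wer must be lower than baseline_wer")
    return (baseline_wer - tuned_wer) / gap


# Hypothetical WERs in percent, chosen only to keep the arithmetic simple.
# They are NOT results from the paper.
baseline, tuned, personalized = 20.0, 15.0, 10.0

print(f"relative improvement: {relative_improvement(baseline, tuned):.0%}")  # 25%
print(f"gap closed: {gap_closed(baseline, tuned, personalized):.0%}")        # 50%
```

Under this reading, a model can show a large relative improvement and still leave part of the gap to personalized models open, which matches the abstract's framing of "significant progress as well as the room for improvement".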

Taxonomy

Topics: Phonetics and Phonology Research · Speech Recognition and Synthesis