AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection
Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li

TL;DR
This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset and the largest of its kind, and shows that fine-tuning on it improves speech recognition and stuttering event detection models.
Contribution
It provides the first large Mandarin stuttered speech dataset and baseline systems, enabling better ASR and stuttering detection for atypical speech.
Findings
Fine-tuning state-of-the-art ASR models (e.g., Whisper and HuBERT) on AS-70 significantly improves their accuracy on stuttered speech
Enhanced stuttering event detection performance
Increased inclusivity of speech models for atypical speech
Abstract
The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription, rendering it suitable for various speech-related tasks. Furthermore, baseline systems are established, and experimental results are presented for ASR and stuttering event detection (SED) tasks. By incorporating this dataset into model fine-tuning, significant improvements in state-of-the-art ASR models, e.g., Whisper and HuBERT, are observed, enhancing…
Taxonomy
Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research · Stuttering Research and Treatment
