Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection

Jinming Zhang; Xuanru Zhou; Jiachen Lian; Shuhe Li; William Li; Zoe Ezzes; Rian Bogley; Lisa Wauters; Zachary Miller; Jet Vonk; Brittany Morin; Maria Gorno-Tempini; Gopala Anumanchipalli

arXiv:2505.22029·eess.AS·June 24, 2025

TL;DR

This paper introduces LLM-Dys, a comprehensive synthetic speech dataset with enhanced dysfluency simulation, improving dysfluency detection accuracy and addressing data scarcity issues in clinical speech analysis.

Contribution

The paper presents LLM-Dys, a large-scale dysfluency speech corpus with LLM-based simulation, and an improved detection framework achieving state-of-the-art results.

Findings

01. State-of-the-art dysfluency detection performance achieved

02. LLM-Dys dataset covers 11 dysfluency categories

03. Open-source release of data, models, and code

Abstract

Speech dysfluency detection is crucial for clinical diagnosis and language assessment, but existing methods are limited by the scarcity of high-quality annotated data. Although recent advances in text-to-speech (TTS) models have enabled synthetic dysfluency generation, existing synthetic datasets suffer from unnatural prosody and limited contextual diversity. To address these limitations, we propose LLM-Dys, the most comprehensive dysfluent speech corpus with LLM-enhanced dysfluency simulation. This dataset captures 11 dysfluency categories spanning both word and phoneme levels. Building upon this resource, we improve an end-to-end dysfluency detection framework. Experimental validation demonstrates state-of-the-art performance. All data, models, and code are open-sourced at https://github.com/Berkeley-Speech-Group/LLM-Dys.

Code & Models

Repositories

berkeley-speech-group/llm-dys
PyTorch, official
