An End-To-End Stuttering Detection Method Based On Conformer And BILSTM
Xiaokang Liu, Changqing Xu, Yudong Yang, Lan Wang, Nan Yan

TL;DR
This paper introduces a novel end-to-end stuttering detection model combining Conformer and BILSTM, utilizing multi-task learning to improve accuracy in identifying stuttering types and severity, outperforming existing methods.
Contribution
The paper proposes a new multi-task learning approach with Conformer and BILSTM for more accurate stuttering detection, achieving state-of-the-art results.
Findings
The model outperforms current state-of-the-art methods.
Achieved a 24.8% improvement in F1 score in the SLT 2024 Challenge.
Further improved the F1 score by 39.8% over the baseline.
Abstract
Stuttering is a neurodevelopmental speech disorder characterized by common speech symptoms such as pauses, exclamations, repetition, and prolongation. Speech-language pathologists typically assess the type and severity of stuttering by observing these symptoms. Many effective end-to-end methods exist for stuttering detection, but a commonly overlooked challenge is the uncertain relationship between tasks involved in this process. Using a suitable multi-task strategy could improve stuttering detection performance. This paper presents a novel stuttering event detection model designed to help speech-language pathologists assess both the type and severity of stuttering. First, the Conformer model extracts acoustic features from stuttered speech, followed by a Long Short-Term Memory (LSTM) network to capture contextual information. Finally, we explore multi-task learning for stuttering and…
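The multi-task idea mentioned in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes two hypothetical tasks that share one encoder output: multi-label stuttering-type detection and single-label severity classification. Their losses are combined with a hypothetical weight `alpha`. The label sets, logits, and weighting scheme are all assumptions made for illustration; this summary does not specify the paper's actual task definitions or loss.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_multilabel(logits, targets):
    """Mean binary cross-entropy over independent labels
    (assumed task 1: one 0/1 label per stuttering type)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one example
    (assumed task 2: a single severity class)."""
    m = max(logits)  # subtract the max for a stable log-sum-exp
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target_index]


def multitask_loss(type_logits, type_targets, sev_logits, sev_target, alpha=0.5):
    """Weighted sum of the two task losses; alpha is a hypothetical knob."""
    l_type = bce_multilabel(type_logits, type_targets)
    l_sev = cross_entropy(sev_logits, sev_target)
    return alpha * l_type + (1 - alpha) * l_sev, l_type, l_sev


# Hypothetical head outputs for one utterance (values are made up).
type_logits = [2.0, -1.0, 0.5, -3.0]   # four assumed stuttering types
type_targets = [1, 0, 1, 0]            # multi-label ground truth
sev_logits = [0.2, 1.5, -0.7]          # three assumed severity levels
sev_target = 1

total, l_type, l_sev = multitask_loss(
    type_logits, type_targets, sev_logits, sev_target, alpha=0.5
)
print(f"type loss={l_type:.4f}  severity loss={l_sev:.4f}  total={total:.4f}")
```

In a full model, the two sets of logits would come from separate output heads on top of the shared Conformer and LSTM features, and `alpha` would be tuned or replaced by a learned weighting.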
Taxonomy
Topics: Stuttering Research and Treatment
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
