Seamless Dysfluent Speech Text Alignment for Disordered Speech Analysis

Zongli Ye; Jiachen Lian; Xuanru Zhou; Jinming Zhang; Haodong Li; Shuhe Li; Chenxu Guo; Anaisha Das; Peter Park; Zoe Ezzes; Jet Vonk; Brittany Morin; Rian Bogley; Lisa Wauters; Zachary Miller; Maria Gorno-Tempini; and Gopala Anumanchipalli

arXiv:2506.12073 · eess.AS · June 17, 2025

TL;DR

This paper introduces Neural LCS, a novel phoneme-level alignment method that improves the accuracy of aligning dysfluent speech with intended text, aiding diagnosis of speech disorders.

Contribution

Neural LCS is a new phoneme-level alignment approach that models phoneme similarities and handles both partial alignment and context-aware similarity mapping for dysfluent speech.
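To make "modeling phoneme similarities" concrete, here is a minimal sketch of a similarity-weighted alignment. It is not the paper's Neural LCS: the listing does not describe the model's internals, so the hand-written `SIMILAR` table, the `similarity` and `soft_align` helpers, and the ARPAbet-style symbols are all assumptions chosen for illustration. The sketch only shows the general idea of replacing exact symbol equality with a graded score inside an LCS-style dynamic program.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical similarity scores for a few confusable phoneme pairs.
# A learned model would supply these; the values here are made up.
SIMILAR: Dict[FrozenSet[str], float] = {
    frozenset({"S", "Z"}): 0.8,
    frozenset({"P", "B"}): 0.8,
}


def similarity(a: str, b: str) -> float:
    """Return 1.0 for identical phonemes, a table score for listed pairs, else 0.0."""
    if a == b:
        return 1.0
    return SIMILAR.get(frozenset({a, b}), 0.0)


def soft_align(reference: List[str], spoken: List[str]) -> List[Tuple[int, int, float]]:
    """Return (ref_index, spoken_index, score) triples maximizing total similarity."""
    n, m = len(reference), len(spoken)
    # score[i][j] = best total similarity for reference[i:] against spoken[j:]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best = max(score[i + 1][j], score[i][j + 1])
            s = similarity(reference[i], spoken[j])
            if s > 0.0:
                best = max(best, score[i + 1][j + 1] + s)
            score[i][j] = best

    # Walk forward through the table to recover one optimal alignment.
    pairs: List[Tuple[int, int, float]] = []
    i = j = 0
    while i < n and j < m:
        s = similarity(reference[i], spoken[j])
        if s > 0.0 and score[i][j] == score[i + 1][j + 1] + s:
            pairs.append((i, j, s))
            i += 1
            j += 1
        elif score[i][j] == score[i + 1][j]:
            i += 1  # skip a reference phoneme (possible deletion)
        else:
            j += 1  # skip a spoken phoneme (possible insertion or repetition)
    return pairs


# Intended "please" against a dysfluent production with a repeated P and a final S.
reference = ["P", "L", "IY", "Z"]
spoken = ["P", "P", "L", "IY", "S"]
alignment = soft_align(reference, spoken)
```

In this toy example the final Z of the reference is paired with the spoken S at a reduced score, while the extra P is left unaligned, which is the kind of partial, similarity-aware alignment the contribution describes at a high level.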

Findings

01 Neural LCS outperforms state-of-the-art models in alignment accuracy.

02 It is robust on both simulated data and real primary progressive aphasia (PPA) data.

03 It significantly improves dysfluent speech segmentation.

Abstract

Accurate alignment of dysfluent speech with intended text is crucial for automating the diagnosis of neurodegenerative speech disorders. Traditional methods often fail to model phoneme similarities effectively, limiting their performance. In this work, we propose Neural LCS, a novel approach for dysfluent text-text and speech-text alignment. Neural LCS addresses key challenges, including partial alignment and context-aware similarity mapping, by leveraging robust phoneme-level modeling. We evaluate our method on a large-scale simulated dataset, generated using advanced data simulation techniques, and real PPA data. Neural LCS significantly outperforms state-of-the-art models in both alignment accuracy and dysfluent speech segmentation. Our results demonstrate the potential of Neural LCS to enhance automated systems for diagnosing and analyzing speech disorders, offering a more accurate…
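The abstract's remark that traditional methods fail to model phoneme similarities can be seen with the classical longest common subsequence (LCS) algorithm, which only pairs symbols that match exactly. The sketch below is that textbook algorithm applied to phoneme lists, not the paper's method; the function name `lcs_align` and the ARPAbet-style example sequences are assumptions made for illustration.

```python
from typing import List, Tuple


def lcs_align(reference: List[str], spoken: List[str]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) with reference[i] == spoken[j] along one LCS."""
    n, m = len(reference), len(spoken)
    # table[i][j] = LCS length of reference[i:] and spoken[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if reference[i] == spoken[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table to recover one optimal set of matches.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if reference[i] == spoken[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # reference phoneme has no exact match here
        else:
            j += 1  # spoken phoneme has no exact match here
    return pairs


# Intended "please" against a production with a repeated P and Z realized as S.
reference = ["P", "L", "IY", "Z"]
spoken = ["P", "P", "L", "IY", "S"]
matches = lcs_align(reference, spoken)
```

Here `matches` covers P, L, and IY but leaves the reference Z and the spoken S unpaired, because exact matching treats the two as unrelated even though they differ only in voicing. That gap is the limitation the abstract attributes to traditional alignment.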

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders