DDKtor: Automatic Diadochokinetic Speech Analysis
Yael Segal, Kasia Hitczenko, Matthew Goldrick, Adam Buchwald, Angela Roberts, Joseph Keshet

TL;DR
This paper introduces two deep neural network models that automatically segment consonants and vowels in diadochokinetic speech, enabling efficient and accurate analysis of speech motor impairments. The models outperform existing methods and perform comparably to trained human annotators.
Contribution
The paper presents novel deep learning models for automatic speech segmentation in DDK tasks, improving accuracy and efficiency over manual and previous automated methods.
Findings
LSTM model outperforms state-of-the-art systems
Models perform comparably to trained human annotators
Effective on both healthy and Parkinson's disease speech datasets
Abstract
Diadochokinetic speech tasks (DDK), in which participants repeatedly produce syllables, are commonly used as part of the assessment of speech motor impairments. These studies rely on manual analyses that are time-intensive, subjective, and provide only a coarse-grained picture of speech. This paper presents two deep neural network models that automatically segment consonants and vowels from unannotated, untranscribed speech. Both models work on the raw waveform and use convolutional layers for feature extraction. The first model is based on an LSTM classifier followed by fully connected layers, while the second model adds more convolutional layers followed by fully connected layers. These segmentations predicted by the models are used to obtain measures of speech rate and sound duration. Results on a young healthy individuals dataset show that our LSTM model outperforms the current…
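The abstract notes that the predicted segmentations are used to obtain speech rate and sound duration. The sketch below illustrates one way such measures could be derived from frame-level predictions; it is not the authors' implementation. The label names ("C", "V", "S" for consonant, vowel, silence), the 10 ms frame step, and the choice to count one syllable per vowel segment are all assumptions made for illustration.

```python
from itertools import groupby


def frames_to_segments(labels, frame_sec):
    """Collapse per-frame labels into (label, start_sec, duration_sec) runs."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        n = len(list(run))
        segments.append((label, start * frame_sec, n * frame_sec))
        start += n
    return segments


def ddk_measures(labels, frame_sec=0.01):
    """Compute illustrative DDK measures from frame-level labels.

    Assumptions (not from the paper): labels are "C", "V", or "S";
    each frame spans frame_sec seconds; one vowel segment = one syllable.
    """
    durations = {"C": [], "V": []}
    for label, _start, dur in frames_to_segments(labels, frame_sec):
        if label in durations:
            durations[label].append(dur)

    total_sec = len(labels) * frame_sec
    n_syllables = len(durations["V"])

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return {
        "n_syllables": n_syllables,
        "syllables_per_sec": n_syllables / total_sec if total_sec > 0 else 0.0,
        "mean_consonant_sec": mean(durations["C"]),
        "mean_vowel_sec": mean(durations["V"]),
    }


if __name__ == "__main__":
    # Hypothetical frame labels for two syllables in a one-second recording.
    frames = ["S"] * 10 + ["C"] * 5 + ["V"] * 15 + ["C"] * 5 + ["V"] * 15 + ["S"] * 50
    print(ddk_measures(frames))
```

Working from contiguous runs of identical labels keeps the measures independent of how the labels were produced, so the same function could be applied to model output or to manual annotations converted to frames.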
Taxonomy
Topics: Voice and Speech Disorders · Phonetics and Phonology Research · Stuttering Research and Treatment
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
