Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning
Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

TL;DR
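This paper jointly trains the self-supervised Wav2vec 2.0 XLS-R model on severity classification and an auxiliary ASR task to classify the severity of dysarthric speech automatically. The approach targets the scarcity of atypical speech data and outperforms baselines built on hand-crafted acoustic features (SVM, MLP, XGBoost) on the Korean QoLT database.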
Contribution
The study presents a new multi-task learning framework combining severity classification and ASR using self-supervised models, enhancing dysarthric speech assessment accuracy.
Findings
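The proposed model outperforms traditional classifiers based on hand-crafted acoustic features, with a relative increase of 1.25% in F1-score.
Adding the auxiliary ASR head yields a 10.61% relative improvement over the same model trained without it.
Multi-task learning affects both the latent representations and the regularization of the model, which helps explain the gain.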
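Both percentages above are relative rather than absolute gains. The sketch below shows how such a figure is typically computed. The function name `relative_improvement` and the example scores are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Return the relative percentage change of `proposed` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (proposed - baseline) / baseline * 100.0


# Hypothetical F1-scores for illustration only, NOT values from the paper.
baseline_f1 = 0.60
proposed_f1 = 0.66

gain = relative_improvement(baseline_f1, proposed_f1)
print(f"{gain:.2f}% relative improvement")
```

With these made-up scores, an absolute gain of 0.06 in F1 corresponds to a relative gain of about 10%, which is why relative and absolute improvements should not be compared directly.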
Abstract
Automatic assessment of dysarthric speech is essential for sustained treatments and rehabilitation. However, obtaining atypical speech is challenging, often leading to data scarcity issues. To tackle the problem, we propose a novel automatic severity assessment method for dysarthric speech, using the self-supervised model in conjunction with multi-task learning. Wav2vec 2.0 XLS-R is jointly trained for two different tasks: severity classification and auxiliary automatic speech recognition (ASR). For the baseline experiments, we employ hand-crafted acoustic features and machine learning classifiers such as SVM, MLP, and XGBoost. Explored on the Korean dysarthric speech QoLT database, our model outperforms the traditional baseline methods, with a relative percentage increase of 1.25% for F1-score. In addition, the proposed model surpasses the model trained without ASR head, achieving…
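The joint training described in the abstract can be pictured as a single objective that sums a severity classification loss and a weighted auxiliary ASR loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the weighted-sum form, the weight `asr_weight`, the three-class example, and the function names are all hypothetical, and the ASR loss is passed in as a precomputed number (for instance a CTC loss, a common choice when fine-tuning wav2vec 2.0, although the abstract does not name one).

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for a single example, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def multitask_loss(
    severity_logits: Sequence[float],
    severity_label: int,
    asr_loss: float,
    asr_weight: float = 0.1,  # assumed hyperparameter, not from the paper
) -> float:
    """Combine the classification loss with a weighted auxiliary ASR loss."""
    cls_loss = cross_entropy(severity_logits, severity_label)
    return cls_loss + asr_weight * asr_loss


# Hypothetical values: three severity classes and a precomputed ASR loss.
total = multitask_loss([2.0, 0.5, -1.0], severity_label=0, asr_loss=3.2)
print(f"total loss: {total:.4f}")
```

Setting `asr_weight` to zero reduces the objective to plain severity classification, which corresponds to the "without ASR head" comparison mentioned in the abstract.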
Taxonomy
Topics: Voice and Speech Disorders · Dysphagia Assessment and Management · Speech Recognition and Synthesis
Methods: Support Vector Machine
