Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning

Eun Jung Yeo; Kwanghee Choi; Sunhee Kim; Minhwa Chung

arXiv:2210.15387·cs.CL·May 1, 2023

TL;DR

This paper fine-tunes the self-supervised Wav2vec 2.0 XLS-R model with multi-task learning for automatic severity classification of dysarthric speech, mitigating data scarcity and outperforming traditional baselines built on hand-crafted acoustic features.

Contribution

The study presents a multi-task learning framework that jointly trains a self-supervised model for severity classification and auxiliary automatic speech recognition (ASR), improving severity classification performance for dysarthric speech.

Findings

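01. The proposed model outperforms traditional classifiers built on hand-crafted acoustic features (SVM, MLP, XGBoost), with a relative F1-score increase of 1.25%.

02. Adding the auxiliary ASR head yields a 10.61% relative improvement over the same model trained without it.

03. Multi-task learning improves the latent representations and has a regularizing effect.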
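
The percentages in the findings are relative changes, not absolute F1 points. A minimal sketch of that arithmetic follows; the function name and the two scores are hypothetical and are not taken from the paper.

```python
def relative_increase(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Hypothetical F1-scores, chosen only to illustrate the arithmetic.
baseline_f1 = 80.0
proposed_f1 = 81.0
print(f"{relative_increase(baseline_f1, proposed_f1):.2f}%")  # prints 1.25%
```

For any baseline below 100, a relative increase of 1.25% corresponds to an absolute gain of less than 1.25 F1 points.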

Abstract

Automatic assessment of dysarthric speech is essential for sustained treatments and rehabilitation. However, obtaining atypical speech is challenging, often leading to data scarcity issues. To tackle the problem, we propose a novel automatic severity assessment method for dysarthric speech, using the self-supervised model in conjunction with multi-task learning. Wav2vec 2.0 XLS-R is jointly trained for two different tasks: severity classification and auxiliary automatic speech recognition (ASR). For the baseline experiments, we employ hand-crafted acoustic features and machine learning classifiers such as SVM, MLP, and XGBoost. Explored on the Korean dysarthric speech QoLT database, our model outperforms the traditional baseline methods, with a relative percentage increase of 1.25% for F1-score. In addition, the proposed model surpasses the model trained without ASR head, achieving…
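
Joint training on severity classification and auxiliary ASR can be pictured as minimizing a weighted sum of the two task losses. The sketch below is an illustration under stated assumptions, not the authors' implementation: the ASR loss is passed in as an already computed number (the abstract does not say which ASR loss is used), and `asr_weight`, the function names, and the example values are all hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-likelihood of class `target` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def joint_loss(
    severity_logits: Sequence[float],
    severity_label: int,
    asr_loss: float,
    asr_weight: float = 0.1,  # hypothetical weighting, not from the paper
) -> float:
    """Severity classification loss plus a weighted auxiliary ASR loss."""
    classification_loss = cross_entropy(severity_logits, severity_label)
    return classification_loss + asr_weight * asr_loss


# Hypothetical utterance: logits over severity classes, true class index 2,
# and an ASR loss assumed to come from a separate ASR head.
loss = joint_loss([0.2, 1.1, 2.3, -0.5], severity_label=2, asr_loss=4.0)
print(f"{loss:.4f}")
```

Setting `asr_weight` to 0 recovers a single-task classifier, which corresponds to the "without ASR head" comparison mentioned in the abstract.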

Code & Models

Repositories

juice500ml/dysarthria-mtl (PyTorch, official)

Taxonomy

Topics: Voice and Speech Disorders · Dysphagia Assessment and Management · Speech Recognition and Synthesis

Methods: Support Vector Machine