Optimal Transport-based Adaptation in Dysarthric Speech Tasks

Rosanna Turrisi; Leonardo Badino

arXiv:2104.02535 · cs.SD · March 15, 2022

TL;DR

This paper introduces an optimal transport-based multi-source domain adaptation method for dysarthric speech, significantly improving detection accuracy, command recognition, and dysarthria diagnosis by leveraging speaker similarity measures.

Contribution

It proposes MSDA-WJDOT, a novel optimal transport approach for dysarthric speech adaptation that outperforms existing models in detection, recognition, and diagnosis tasks.

Findings

01. 0.9% improvement in dysarthria detection accuracy over the best competing method

02. 16% reduction in command error rate

03. 95% accuracy in dysarthria diagnosis

Abstract

In many real-world applications, the mismatch between the distributions of training data (source) and test data (target) significantly degrades the performance of machine learning algorithms. In speech data, causes of this mismatch include different acoustic environments or speaker characteristics. In this paper, we address this issue in the challenging context of dysarthric speech through multi-source domain/speaker adaptation (MSDA/MSSA). Specifically, we propose the use of an optimal transport-based approach called MSDA via Weighted Joint Optimal Transport (MSDA-WJDOT). We confront the mismatch problem in dysarthria detection, for which the proposed approach outperforms both the Baseline and the state-of-the-art MSDA models, improving the detection accuracy by 0.9% over the best competitor method. We then employ MSDA-WJDOT for dysarthric speaker adaptation in command speech recognition. This…

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Voice and Speech Disorders