Interpretable Dysarthric Speaker Adaptation based on Optimal-Transport

Rosanna Turrisi; Leonardo Badino

arXiv:2203.07143 · cs.CL · September 13, 2023

TL;DR

This paper introduces an interpretable, unsupervised multi-source domain adaptation method based on optimal transport for dysarthric speech recognition. It reduces the command error rate and can be used to diagnose dysarthria without any diagnosis-specific training.

Contribution

The paper proposes MSDA-WJDOT, an unsupervised Multi-Source Domain Adaptation (MSDA) algorithm based on weighted joint optimal transport. The resulting model is interpretable and can be used to diagnose dysarthria directly from speech data.
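The interpretability claim rests on per-source weights that indicate how close each source speaker is to the target speaker. The sketch below illustrates that idea only and is not the MSDA-WJDOT algorithm, whose details are not given in this summary. It swaps in a much simpler technique: a 1-D Wasserstein-1 distance between equal-size samples, followed by a softmax over negative distances. The function names, the `temperature` parameter, the speaker names, and the feature values are all hypothetical.

```python
import math


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For uniform weights and equal sizes, the optimal transport plan simply
    matches the sorted samples, so the distance is the mean absolute gap.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def source_weights(sources, target, temperature=1.0):
    """Turn distances into weights on the simplex (assumed softmax rule).

    Sources closer to the target get larger weights. This fixed rule stands
    in for weights that a real method would learn during adaptation.
    """
    dists = {name: wasserstein_1d(feats, target) for name, feats in sources.items()}
    scores = {name: math.exp(-d / temperature) for name, d in dists.items()}
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


# Hypothetical 1-D features for two source speakers and one target speaker.
sources = {
    "control_speaker": [0.0, 0.1, 0.2, 0.3],
    "dysarthric_speaker": [1.0, 1.1, 1.2, 1.3],
}
target = [0.9, 1.0, 1.1, 1.2]

weights = source_weights(sources, target)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

In this toy setup the target receives most of its weight from the dysarthric source speaker, which is the kind of similarity signal the summary describes as usable for diagnosis.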

Findings

01

Achieved a 16% relative reduction in Command Error Rate over the speaker-independent baseline, and 7% over the best competing method

02

Attained 95% accuracy in dysarthria diagnosis

03

Provided a measure of similarity between speakers for diagnosis
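The abstract reports the 16% figure as a relative reduction in Command Error Rate (CER), which is different from a drop of 16 percentage points. The short example below shows the arithmetic. The baseline and adapted CER values are hypothetical numbers chosen for illustration and are not results from the paper.

```python
def relative_reduction(baseline_error, new_error):
    """Relative error reduction: (baseline - new) / baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical CER values in percent, not taken from the paper.
baseline_cer = 25.0   # speaker-independent model
adapted_cer = 21.0    # after speaker adaptation

rel = relative_reduction(baseline_cer, adapted_cer)
absolute_points = baseline_cer - adapted_cer

print(f"relative reduction: {rel:.0%}")
print(f"absolute reduction: {absolute_points:.1f} points")
```

With these illustrative values, a 4-point absolute drop from a 25% baseline corresponds to a 16% relative reduction.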

Abstract

This work addresses the mismatch between the distribution of training data (source) and testing data (target) in the challenging context of dysarthric speech recognition. We focus on Speaker Adaptation (SA) in command speech recognition, where data from multiple sources (i.e., multiple speakers) are available. Specifically, we propose an unsupervised Multi-Source Domain Adaptation (MSDA) algorithm based on optimal transport, called MSDA via Weighted Joint Optimal Transport (MSDA-WJDOT). We achieve a relative Command Error Rate reduction of 16% and 7% over the speaker-independent model and the best competitor method, respectively. The strength of the proposed approach is that, unlike any other existing SA method, it offers an interpretable model that can also be exploited, in this context, to diagnose dysarthria without any specific training. Indeed, it provides a…
