Domain Adversarial Neural Networks for Dysarthric Speech Recognition

Dominika Woszczyk; Stavros Petridis; David Millard

arXiv:2010.03623·cs.SD·October 9, 2020

TL;DR

This paper applies domain adversarial neural networks (DANN) to dysarthric speech recognition. DANN achieves higher accuracy than the baseline model and results comparable to a speaker-adaptive model, and it outperforms multi-task learning when labeled data is limited.

Contribution

The paper applies DANN to speaker-independent dysarthric speech recognition and shows that it outperforms the baseline model.

Findings

01

DANN achieves a 74.91% recognition rate, outperforming the baseline by 12.18%.

02

DANN performs comparably to speaker-adaptive models.

03

DANN outperforms multi-task learning when labeled data is limited.

Abstract

Speech recognition systems have improved dramatically over the last few years; however, their performance degrades significantly on accented or impaired speech. This work explores domain adversarial neural networks (DANN) for speaker-independent speech recognition on the UAS dataset of dysarthric speech. The classification task on 10 spoken digits is performed using an end-to-end CNN taking raw audio as input. The results are compared to a speaker-adaptive (SA) model as well as speaker-dependent (SD) and multi-task learning (MTL) models. The experiments conducted in this paper show that DANN achieves an absolute recognition rate of 74.91% and outperforms the baseline by 12.18%. Additionally, the DANN model achieves comparable results to the SA model's recognition rate of 77.65%. We also observe that when labelled dysarthric speech data is available DANN and MTL perform…
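
The abstract does not describe how the adversarial part is wired, so the following is only a minimal sketch of the gradient reversal idea that DANN models are generally built on, not the authors' implementation. The class name, the helper function, the weight `lam`, and all numbers are hypothetical and chosen for illustration. The layer passes features through unchanged in the forward pass and flips the sign of the domain classifier's gradient in the backward pass, so the shared feature extractor is pushed to help the digit classifier while hindering the domain classifier.

```python
from dataclasses import dataclass


@dataclass
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    lam: float  # hypothetical trade-off weight, not a value from the paper

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_head):
        # The domain-loss gradient is reversed before it reaches the feature extractor.
        return [-self.lam * g for g in grad_from_domain_head]


def feature_gradient(grad_label, grad_domain, grl):
    """Combine the two gradients that arrive at the shared feature extractor."""
    reversed_domain = grl.backward(grad_domain)
    return [gl + gd for gl, gd in zip(grad_label, reversed_domain)]


grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
passed_through = grl.forward(features)

# Toy gradients of the digit loss and the domain loss with respect to the features.
grad_label = [1.0, 0.0, -2.0]
grad_domain = [4.0, -2.0, 0.0]
combined = feature_gradient(grad_label, grad_domain, grl)
print(combined)
```

In a real model the same sign flip would be applied inside an automatic differentiation framework, and the combined gradient would continue backward through the CNN layers that process the raw audio.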
