Aphasic Speech Recognition using a Mixture of Speech Intelligibility Experts

Matthew Perez; Zakaria Aldeneh; Emily Mower Provost

arXiv:2008.10788·eess.AS·March 21, 2023

TL;DR

This paper introduces a mixture-of-experts acoustic model for aphasic speech recognition. A speech intelligibility detector weights severity-based experts at test time, and the model significantly reduces phone error rates across all severity stages compared to a baseline.

Contribution

The paper presents a novel severity-based mixture-of-experts acoustic model that uses an explicit speech intelligibility estimate to weight its experts for aphasic speech recognition.

Findings

01. Significant reduction in phone error rates across severity stages

02. Effective use of a speech intelligibility detector for expert weighting

03. Improved robustness over baseline models

Abstract

Robust speech recognition is a key prerequisite for semantic feature extraction in automatic aphasic speech analysis. However, standard one-size-fits-all automatic speech recognition models perform poorly when applied to aphasic speech. One reason for this is the wide range of speech intelligibility due to different levels of severity (i.e., higher severity lends itself to less intelligible speech). To address this, we propose a novel acoustic model based on a mixture of experts (MoE), which handles the varying intelligibility stages present in aphasic speech by explicitly defining severity-based experts. At test time, the contribution of each expert is decided by estimating speech intelligibility with a speech intelligibility detector (SID). We show that our proposed approach significantly reduces phone error rates across all severity stages in aphasic speech compared to a baseline…
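The expert-weighting idea in the abstract can be illustrated with a minimal sketch. It assumes that each severity-based expert emits a phone posterior for a frame and that the SID produces one score per expert, which is turned into mixing weights with a softmax. The abstract does not specify where the mixing happens or how the weights are normalized, so the function names, the three experts, the four phone classes, and the toy numbers are all hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(
    expert_posteriors: Sequence[Sequence[float]],
    sid_scores: Sequence[float],
) -> List[float]:
    """Combine per-expert phone posteriors for a single frame.

    expert_posteriors[k][p]: expert k's probability for phone class p.
    sid_scores[k]: hypothetical SID score for severity level k
    (assumption: one score per expert, normalized with a softmax).
    """
    if len(expert_posteriors) != len(sid_scores):
        raise ValueError("need exactly one SID score per expert")
    weights = softmax(sid_scores)
    num_phones = len(expert_posteriors[0])
    mixed = [0.0] * num_phones
    # Weighted sum of expert outputs; weights sum to 1, so the
    # result is still a probability distribution over phones.
    for weight, posterior in zip(weights, expert_posteriors):
        for p in range(num_phones):
            mixed[p] += weight * posterior[p]
    return mixed


# Toy frame: 3 hypothetical severity experts, 4 hypothetical phone classes.
experts = [
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.10, 0.10, 0.70],
]
sid_logits = [2.0, 0.0, -2.0]  # SID leans toward the first expert
frame_posterior = mix_experts(experts, sid_logits)
```

With these toy scores the first expert dominates the mixture, so the combined posterior favors the first phone class. Equal SID scores would reduce the mixture to a plain average of the experts.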
