# Personalizing ASR for Dysarthric and Accented Speech with Limited Data

**Authors:** Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias

arXiv: 1907.13511 · 2021-05-11

## TL;DR

This paper presents finetuning techniques that personalize automatic speech recognition (ASR) for speakers with ALS-related dysarthria and for accented speech. Personalized models achieve 62% and 35% relative WER improvement on these two groups, respectively, and 71% of the improvement comes from only 5 minutes of training data.

## Contribution

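It presents and evaluates finetuning methods that improve ASR accuracy for dysarthric and accented speech using limited data, and shows that finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model.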

## Key findings

- 62% and 35% relative WER improvement for ALS and accented speech, respectively
- Absolute WER for ALS speakers, on a test set of message bank phrases, falls to 10% for mild dysarthria and 20% for more serious dysarthria
- 71% of the improvement comes from only 5 minutes of training data
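
The findings above mix absolute WER with relative WER improvement, so it may help to see how the two relate. The following is a minimal sketch, assuming the standard word-level edit-distance definition of WER; the function names and the example sentences are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, personalized_wer: float) -> float:
    """Fraction of the baseline WER removed by the personalized model."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - personalized_wer) / baseline_wer


# Illustrative transcript pair (hypothetical, not from the paper's data):
# one deleted word plus one substituted word out of five reference words.
example_wer = word_error_rate("turn on the kitchen light", "turn on kitchen lights")
```

Under this definition, a relative improvement of 62% means the personalized model's WER is 38% of the baseline model's WER, whatever the baseline happens to be.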

## Abstract

Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech: speech from people with amyotrophic lateral sclerosis (ALS) and accented speech. We train personalized models that achieve 62% and 35% relative WER improvement on these two groups, bringing the absolute WER for ALS speakers, on a test set of message bank phrases, down to 10% for mild dysarthria and 20% for more serious dysarthria. We show that 71% of the improvement comes from only 5 minutes of training data. Finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model. This is the first step towards building state of the art ASR models for dysarthric speech.
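
The abstract's remark about finetuning a subset of layers can be illustrated by counting how many parameters remain trainable once the rest are frozen. This is only a sketch under stated assumptions: the layer names, shapes, and the choice of which prefix to train are hypothetical placeholders, since this summary does not say which architecture or layers the authors used.

```python
from math import prod

# Hypothetical parameter names and shapes; NOT the paper's architecture.
PARAM_SHAPES = {
    "encoder.layer0.weight": (256, 80),
    "encoder.layer1.weight": (256, 256),
    "decoder.layer0.weight": (256, 256),
    "output.weight": (64, 256),
}


def select_trainable(param_shapes, trainable_prefixes):
    """Keep only parameters whose name starts with one of the given prefixes.

    Everything not returned here would stay frozen during finetuning.
    """
    prefixes = tuple(trainable_prefixes)
    return {
        name: shape
        for name, shape in param_shapes.items()
        if name.startswith(prefixes)
    }


def count_params(param_shapes):
    """Total number of scalar parameters across all listed tensors."""
    return sum(prod(shape) for shape in param_shapes.values())


# Assumption for illustration: only the first encoder layer is updated.
trainable = select_trainable(PARAM_SHAPES, ["encoder.layer0"])
trainable_fraction = count_params(trainable) / count_params(PARAM_SHAPES)
```

In a real training setup the same selection would typically decide which tensors receive gradient updates; with only a few minutes of audio per speaker, a smaller trainable set leaves fewer parameters to fit from that data.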

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.13511/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.13511/full.md

---
Source: https://tomesphere.com/paper/1907.13511