# Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

**Authors:** Emre Yılmaz, Vikramjit Mitra, Ganesh Sivaraman, Horacio Franco

arXiv: 1905.06533 · 2019-05-22

## TL;DR

This paper evaluates speaker-independent articulatory and bottleneck features, paired with neural network-based acoustic models, for automatic speech recognition (ASR) of dysarthric speech, and reports significant improvements on two dysarthric speech datasets.

## Contribution

It compares speaker-independent articulatory and bottleneck features, used with neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations, for dysarthric speech recognition without prior speaker information.

## Key findings

- Significant ASR performance improvements on both dysarthric speech datasets evaluated.
- The gains are obtained with speaker-independent ASR architectures, without prior speaker information.
- A performance gap between dysarthric and normal speech remains.

## Abstract

Rapid population aging has stimulated the development of assistive devices that provide personalized medical support to people in need suffering from various etiologies. One prominent clinical application is a computer-assisted speech training system which enables personalized speech therapy for patients impaired by communicative disorders in the patient's home environment. Such a system relies on robust automatic speech recognition (ASR) technology to be able to provide accurate articulation feedback. With the long-term aim of developing off-the-shelf ASR systems that can be incorporated in a clinical context without prior speaker information, we compare the ASR performance of speaker-independent bottleneck and articulatory features on dysarthric speech used in conjunction with dedicated neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations. We report ASR performance of these systems on two dysarthric speech datasets of different characteristics to quantify the achieved performance gains. Despite the remaining performance gap between dysarthric and normal speech, significant improvements have been reported on both datasets using speaker-independent ASR architectures.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.06533/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1905.06533/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/1905.06533/full.md

---
Source: https://tomesphere.com/paper/1905.06533