The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification

Changye Li, Trevor Cohen, and Serguei Pakhomov

arXiv:2211.07430 · eess.AS · November 15, 2022 · 1 citation

TL;DR

This study investigates how speech recognition errors affect downstream dementia classification, finding that ASR transcripts with higher error rates can sometimes yield better classification accuracy.

Contribution

It demonstrates that imperfect ASR transcripts can outperform verbatim transcripts for dementia detection, challenging assumptions about the necessity of high transcription accuracy.

Findings

1. Higher error rate ASR transcripts can improve classification accuracy.
2. Imperfect transcripts may contain more relevant linguistic cues.
3. ASR performance varies with clinical speech complexity.

Abstract

Linguistic anomalies detectable in spontaneous speech have shown promise for various clinical applications including screening for dementia and other forms of cognitive impairment. The feasibility of deploying automated tools that can classify language samples obtained from speech in large-scale clinical settings depends on the ability to capture and automatically transcribe the speech for subsequent analysis. However, the impressive performance of self-supervised learning (SSL) automatic speech recognition (ASR) models with curated speech data is not apparent with challenging speech samples from clinical settings. One of the key questions for successfully applying ASR models for clinical applications is whether imperfect transcripts they generate provide sufficient information for downstream tasks to operate at an acceptable level of accuracy. In this study, we examine the relationship…
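
ASR transcription error is conventionally reported as word error rate (WER): the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The sketch below is a minimal illustration of that standard metric, on the assumption that it is the kind of error rate at issue here. The function name `word_error_rate` and its lowercase, whitespace-based tokenization are illustrative choices, not taken from the paper or its repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is a simple
    lowercase whitespace split, which is an assumption of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)
```

With the made-up pair `"the boy is taking a cookie"` (reference) and `"the boy taking the cookie"` (hypothesis), the function counts one deletion and one substitution against six reference words, giving a WER of 2/6. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.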

Code & Models

Repositories

linguisticanomalies/paradox-asr
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Speech and dialogue systems