The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification
Changye Li, Trevor Cohen, and Serguei Pakhomov

TL;DR
This study investigates how speech recognition errors affect dementia classification, revealing that higher error rates in ASR can sometimes improve classification accuracy in clinical speech analysis.
Contribution
It demonstrates that imperfect ASR transcripts can outperform verbatim transcripts for dementia detection, challenging assumptions about the necessity of high transcription accuracy.
Findings
ASR transcripts with higher error rates can improve dementia classification accuracy.
Imperfect transcripts may contain more relevant linguistic cues.
ASR performance varies with clinical speech complexity.
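The "error rate" in these findings is not defined in this summary. A minimal sketch, assuming it refers to the standard word error rate (WER): the word-level edit distance between a verbatim reference transcript and the ASR output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: lowercasing and whitespace tokenization stand in for
    whatever text normalization a real evaluation would apply.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing from ASR output)
                curr[j - 1] + 1,                  # insertion (extra word in ASR output)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented sentences, for illustration only.
    verbatim = "the cat sat on the mat"
    asr_output = "the cat sat on mat"
    print(f"WER: {word_error_rate(verbatim, asr_output):.2f}")
```

Because the denominator is the reference length, WER can exceed 1.0 when the ASR output contains many inserted words.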
Abstract
Linguistic anomalies detectable in spontaneous speech have shown promise for various clinical applications including screening for dementia and other forms of cognitive impairment. The feasibility of deploying automated tools that can classify language samples obtained from speech in large-scale clinical settings depends on the ability to capture and automatically transcribe the speech for subsequent analysis. However, the impressive performance of self-supervised learning (SSL) automatic speech recognition (ASR) models with curated speech data is not apparent with challenging speech samples from clinical settings. One of the key questions for successfully applying ASR models for clinical applications is whether imperfect transcripts they generate provide sufficient information for downstream tasks to operate at an acceptable level of accuracy. In this study, we examine the relationship…
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Speech and dialogue systems
