Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification

Laurin Wagner; Mario Zusag; Theresa Bloder

arXiv:2308.01327 · cs.SD · August 4, 2023

TL;DR

This paper introduces an automated speech analysis pipeline combining advanced speech recognition and NLP techniques to accurately classify aphasia subtypes and distinguish affected speech from healthy controls, with potential for broader diagnostic applications.

Contribution

The paper presents an integrated pipeline that combines CTC and encoder-decoder ASR models with NLP features for robust, interpretable aphasia classification, distinguishing the most frequently occurring aphasia types with 90% accuracy.

Findings

01

The most frequently occurring aphasia types are distinguished with 90% accuracy.

02

The classifiers reach human-level accuracy in distinguishing recordings of people with aphasia from a healthy control group.

03

The pipeline is directly applicable to other diseases and languages.

Abstract

This paper presents a fully automated approach for identifying speech anomalies from voice recordings to aid in the assessment of speech impairments. By combining Connectionist Temporal Classification (CTC) and encoder-decoder-based automatic speech recognition models, we generate rich acoustic and clean transcripts. We then apply several natural language processing methods to extract features from these transcripts to produce prototypes of healthy speech. Basic distance measures from these prototypes serve as input features for standard machine learning classifiers, yielding human-level accuracy for the distinction between recordings of people with aphasia and a healthy control group. Furthermore, the most frequently occurring aphasia types can be distinguished with 90% accuracy. The pipeline is directly applicable to other diseases and languages, showing promise for robustly…
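The prototype-and-distance step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three toy features per recording, the use of a component-wise mean as the healthy-speech prototype, the choice of Euclidean and cosine distance, and the function names `build_prototype` and `distance_features`. The abstract only says that "basic distance measures" from prototypes of healthy speech feed standard classifiers; it does not specify the features or measures used.

```python
import math
from statistics import fmean


def build_prototype(healthy_vectors):
    """Average the healthy speakers' feature vectors component-wise (assumed prototype)."""
    return [fmean(column) for column in zip(*healthy_vectors)]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def distance_features(vector, prototype):
    """Turn one recording's features into distances from the healthy prototype."""
    return [euclidean_distance(vector, prototype), cosine_distance(vector, prototype)]


# Made-up toy features per recording (hypothetical, not taken from the paper).
healthy = [[2.0, 0.5, 0.1], [2.2, 0.6, 0.1], [1.8, 0.4, 0.1]]
prototype = build_prototype(healthy)

new_recording = [0.9, 0.2, 0.4]
features = distance_features(new_recording, prototype)
```

The resulting `features` list is what would be passed to a standard classifier in place of the raw transcript features. Because each value is a distance from typical healthy speech, a large value can be read directly as "this recording deviates from the healthy prototype", which is one plausible reading of the interpretability the title refers to.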

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Topic Modeling · Text Readability and Simplification