AS-ASR: A Lightweight Framework for Aphasia-Specific Automatic Speech Recognition

Chen Bao; Chuanbing Huo; Qinyu Chen; Chang Gao

arXiv:2506.06566 · eess.AS · February 3, 2026

TL;DR

This paper introduces AS-ASR, a lightweight, aphasia-specific speech recognition framework built on Whisper-tiny and intended for edge devices. It combines hybrid training on standard and aphasic speech with GPT-4-based transcript enhancement to improve recognition accuracy on aphasic speech.

Contribution

The paper contributes a hybrid training strategy that mixes standard and aphasic speech at varying ratios, together with a GPT-4-based method for refining noisy aphasic reference transcripts. Both are aimed at low-resource edge deployment.

Findings

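01. WER on aphasic speech is reduced by over 30% compared with the zero-shot baseline.

02. The fine-tuned model preserves performance on standard speech.

03. The framework is presented as scalable and efficient for real-world disordered speech recognition.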
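
For readers unfamiliar with the metric behind the first finding, word error rate (WER) is the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The sketch below is a minimal, dependency-free illustration of that standard definition. The function name `word_error_rate` and the example sentences are illustrative assumptions and do not come from the paper, which may apply its own text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one
# deleted word ("the") against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.3f}")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.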

Abstract

This paper proposes AS-ASR, a lightweight aphasia-specific speech recognition framework based on Whisper-tiny, tailored for low-resource deployment on edge devices. Our approach introduces a hybrid training strategy that systematically combines standard and aphasic speech at varying ratios, enabling robust generalization, and a GPT-4-based reference enhancement method that refines noisy aphasic transcripts, improving supervision quality. We conduct extensive experiments across multiple data mixing configurations and evaluation settings. Results show that our fine-tuned model significantly outperforms the zero-shot baseline, reducing WER on aphasic speech by over 30% while preserving performance on standard speech. The proposed framework offers a scalable, efficient solution for real-world disordered speech recognition.
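
The hybrid training strategy described in the abstract combines standard and aphasic speech at varying ratios. The abstract does not specify how the mixtures are sampled, so the following is only a sketch of one plausible way to build a fixed-size training set at a given ratio. The function `mix_datasets`, its parameters, and the placeholder utterance tuples are all hypothetical and are not taken from the paper.

```python
import random


def mix_datasets(aphasic, standard, aphasic_ratio, total, seed=0):
    """Draw `total` examples, a fraction `aphasic_ratio` of them aphasic.

    Assumption: sampling is without replacement and the mixed set is
    shuffled. The paper may construct its mixtures differently.
    """
    if not 0.0 <= aphasic_ratio <= 1.0:
        raise ValueError("aphasic_ratio must be between 0 and 1")

    n_aphasic = round(total * aphasic_ratio)
    n_standard = total - n_aphasic
    if n_aphasic > len(aphasic) or n_standard > len(standard):
        raise ValueError("not enough examples for the requested mixture")

    rng = random.Random(seed)  # fixed seed keeps the mixture reproducible
    mixed = rng.sample(aphasic, n_aphasic) + rng.sample(standard, n_standard)
    rng.shuffle(mixed)
    return mixed


# Placeholder corpora: (source, utterance_id) pairs stand in for audio/text pairs.
aphasic_pool = [("aphasic", i) for i in range(20)]
standard_pool = [("standard", i) for i in range(20)]

train_set = mix_datasets(aphasic_pool, standard_pool, aphasic_ratio=0.3, total=10)
n_aphasic = sum(1 for source, _ in train_set if source == "aphasic")
print(f"{n_aphasic} aphasic and {len(train_set) - n_aphasic} standard examples")
```

Sweeping `aphasic_ratio` over several values would give a family of training sets comparable in spirit to the multiple data mixing configurations the abstract mentions.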
