CLAP-Based Automatic Word Naming Recognition in Post-Stroke Aphasia

Yacouba Kaloga; Marina Laganaro; Ina Kodrasi

arXiv:2602.14584 · eess.AS · February 17, 2026

TL;DR

This paper introduces a CLAP-based method for automatic word-naming recognition in post-stroke patients with aphasia. It is designed to cope with the disfluencies and mispronunciations that limit conventional systems, supporting more reliable automated assessment.

Contribution

The paper frames word-naming recognition as an audio-text matching task: speech signals and textual prompts are projected into a shared embedding space so that the intended word can be identified even in challenging recordings.

Findings

01 Achieves up to 90% accuracy on two speech datasets of French post-stroke patients with aphasia

02 Outperforms existing classification-based and ASR-based baselines

03 Identifies intended words even in disfluent and mispronounced recordings

Abstract

Conventional automatic word-naming recognition systems struggle to recognize words from post-stroke patients with aphasia because of disfluencies and mispronunciations, limiting reliable automated assessment in this population. In this paper, we propose a Contrastive Language-Audio Pretraining (CLAP) based approach for automatic word-naming recognition to address this challenge by leveraging text-audio alignment. Our approach treats word-naming recognition as an audio-text matching problem, projecting speech signals and textual prompts into a shared embedding space to identify intended words even in challenging recordings. Evaluated on two speech datasets of French post-stroke patients with aphasia, our approach achieves up to 90% accuracy, outperforming existing classification-based and automatic speech recognition-based baselines.
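
The audio-text matching step described in the abstract can be illustrated with a minimal sketch. It assumes the audio and text encoders have already produced embeddings in the shared space; the toy vectors, the three-word French vocabulary, and the function names `l2_normalize` and `match_word` are all hypothetical and are not taken from the paper. Cosine similarity with an argmax decision is a common choice for CLAP-style models, but the paper's exact scoring rule is not stated here.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def match_word(audio_embedding, text_embeddings, vocabulary):
    """Return the candidate word whose text embedding is closest to the audio.

    audio_embedding: shape (d,), assumed output of an audio encoder.
    text_embeddings: shape (n_words, d), assumed output of a text encoder,
                     one row per candidate word prompt.
    vocabulary:      list of n_words candidate words.
    """
    a = l2_normalize(np.asarray(audio_embedding, dtype=float))
    t = l2_normalize(np.asarray(text_embeddings, dtype=float))
    scores = t @ a  # cosine similarity of each prompt with the recording
    best = int(np.argmax(scores))
    return vocabulary[best], scores


# Toy stand-ins for encoder outputs (hypothetical values, 3-dimensional space).
vocabulary = ["chat", "maison", "pomme"]
text_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
audio_embedding = np.array([0.2, 0.9, 0.1])  # a noisy rendition near "maison"

predicted_word, scores = match_word(audio_embedding, text_embeddings, vocabulary)
print(predicted_word, np.round(scores, 3))
```

Because the decision is made over a closed set of candidate prompts, a recording only has to land closer to the intended word than to the other candidates, which is one plausible reason this formulation tolerates mispronunciations better than open-vocabulary transcription.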

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Speech Recognition and Synthesis · Voice and Speech Disorders