National Institute on Aging PREPARE Challenge: Early Detection of Cognitive Impairment Using Speech -- The SpeechCARE Solution

Maryam Zolnoori; Hossein Azadmaleki; Yasaman Haghbin; Ali Zolnour; Mohammad Javad Momeni Nezhad; Sina Rashidi; Mehdi Naserian; Elyas Esmaeili; Sepehr Karimi Arpanahi

arXiv:2511.08132 · cs.AI · November 14, 2025

Open Access

TL;DR

This paper introduces SpeechCARE, a multimodal speech-processing pipeline built on pretrained acoustic and linguistic transformer models for early detection of cognitive impairment. It reports AUC=0.88 and F1=0.72 for classifying cognitive health states, with robustness across diverse demographic groups.

Contribution

SpeechCARE is a novel multimodal, transformer-based speech assessment system that improves early detection of cognitive decline with enhanced explainability and robustness.

Findings

01. Achieved AUC=0.88 and F1=0.72 for classifying cognitive health states.

02. Demonstrated robustness across diverse demographic groups.

03. Minimal bias observed, with targeted mitigation techniques.
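
For readers unfamiliar with the two metrics in the first finding, here is a minimal sketch of how AUC and F1 are defined. It assumes a binary setting (1 = impaired, 0 = healthy) and a 0.5 decision threshold; the function names and the toy labels and scores are hypothetical, and the paper's exact evaluation protocol (for example, how the metrics are averaged across classes) is not stated on this page.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the pairwise, Mann-Whitney form of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, made up for illustration only (not from the paper).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc = roc_auc(labels, scores)
f1 = f1_score(labels, preds)
```

AUC depends only on how the scores rank positives against negatives, while F1 depends on the chosen threshold, which is one reason the two numbers can differ as much as they do in the reported result.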

Abstract

Alzheimer's disease and related dementias (ADRD) affect one in five adults over 60, yet more than half of individuals with cognitive decline remain undiagnosed. Speech-based assessments show promise for early detection, as phonetic motor planning deficits alter acoustic features (e.g., pitch, tone), while memory and language impairments lead to syntactic and semantic errors. However, conventional speech-processing pipelines with hand-crafted features or general-purpose audio classifiers often exhibit limited performance and generalizability. To address these limitations, we introduce SpeechCARE, a multimodal speech processing pipeline that leverages pretrained, multilingual acoustic and linguistic transformer models to capture subtle speech-related cues associated with cognitive impairment. Inspired by the Mixture of Experts (MoE) paradigm, SpeechCARE employs a dynamic fusion…
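
The abstract is cut off before it describes the fusion step, so the following is only a generic sketch of Mixture-of-Experts-style gating, not SpeechCARE's actual design. It assumes each modality "expert" (here, one acoustic and one linguistic) emits class probabilities and that a gate supplies one unnormalized score per expert; all names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(expert_probs, gate_scores):
    """Weighted average of per-expert class probabilities.

    expert_probs: one probability list per expert, all the same length.
    gate_scores:  one unnormalized score per expert. In a trained model
                  these would come from a learned gating network; here
                  they are supplied by hand (an assumption of this sketch).
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
    return fused, weights


# Made-up expert outputs over three illustrative classes.
acoustic = [0.6, 0.3, 0.1]
linguistic = [0.2, 0.5, 0.3]

# Equal gate scores give each expert the same weight.
fused, weights = gated_fusion([acoustic, linguistic], [0.0, 0.0])

# A higher score for the linguistic expert shifts the fused output toward it.
fused_ling, weights_ling = gated_fusion([acoustic, linguistic], [0.0, 2.0])
```

Because the gate weights are a softmax, they are non-negative and sum to one, so the fused output remains a valid probability distribution whenever each expert's output is one. "Dynamic" in this setting usually means the gate scores are computed per input rather than fixed.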

Taxonomy

Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Machine Learning in Healthcare