Benchmarking Foundation Models for Alzheimer’s Disease and Related Dementia Detection from Spontaneous Speech
Jingyu Li, Lingchao Mao, Hairong Wang, Zhendong Wang, Xi Mao, Xuelei Ni

TL;DR
This paper benchmarks open-source speech and language foundation models for detecting Alzheimer's disease and related dementias (ADRD) from spontaneous speech, a non-invasive route to early diagnosis.
Contribution
The study introduces a benchmarking framework for foundation models in ADRD detection, using recordings from over 1,600 individuals in the PREPARE Phase 2 Challenge.
Findings
Whisper-medium achieved the highest accuracy (0.731) and AUC (0.802) among speech-based models.
ASR models outperformed other models in classifying cognitive decline stages.
Adding prosodic features improved performance in text-based approaches.
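For readers unfamiliar with the two metrics reported in the findings above, the following is a minimal sketch of how accuracy and AUC can be computed. It assumes a simplified binary setting (impaired vs. control); how the paper aggregates these metrics across stages of cognitive decline is not stated here. The function names `accuracy` and `binary_auc` and the toy labels and scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data (NOT from the paper): 1 = impaired, 0 = control.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]

# Accuracy needs hard predictions; here we threshold the scores at 0.5.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(labels, preds):.3f}")
print(f"AUC      = {binary_auc(labels, scores):.3f}")
```

Accuracy depends on the decision threshold applied to the scores, whereas AUC depends only on how the scores rank positives against negatives, which is one reason the two are commonly reported together.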
Abstract
Alzheimer’s disease and related dementias (ADRD) are progressive neurodegenerative conditions where early detection is critical for timely intervention and care planning. Acoustic biomarkers—such as changes in prosody, fluency, and pause patterns—can be extracted from spontaneous speech and offer a non-invasive avenue for early diagnosis. Foundational speech and language models, which are pre-trained deep learning models, can generate high-dimensional embeddings that capture rich contextual and acoustic information from raw audio or text. Using data from the PREPARE Phase 2 Challenge, which includes recordings from over 1,600 individuals, we examined the potential of foundation models for ADRD detection. Specifically, we benchmarked a range of open-source speech and language models on their ability to classify participants into different stages of cognitive decline. Among speech-based…
Taxonomy
Topics: Mental Health via Writing · Machine Learning in Healthcare · Emotion and Mood Recognition
