Small Language Models for Early Detection of ADRD: Open-Source Language Models in Clinical Speech Classification

Venkatanand ram Addepalli; Praveen Rao; Erich Kummerfeld; Andrew Kiselica; Knoo Lee

PMC · DOI: 10.1093/geroni/igaf122.4317 · December 31, 2025

Open Access

TL;DR

This study shows that small language models can accurately detect dementia from speech transcripts while preserving privacy and reducing costs.

Contribution

First on-premises deployment of SLMs for dementia classification from linguistic features in a HIPAA-compliant setting.

Findings

1. GPT-2 achieved 78.3% accuracy in classifying dementia from speech transcripts.

2. SLMs delivered inference times under three seconds.

3. SLMs offer a privacy-preserving alternative to large language models for clinical use.
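The findings above report two kinds of measurement: classification accuracy and per-transcript inference time. The following is a minimal sketch of how both could be computed for any locally running classifier. It is not the authors' evaluation code. The names `evaluate`, `toy_classifier`, and `TOY_SAMPLES` are hypothetical, the keyword rule stands in for a real SLM, and the four strings are invented for illustration, not taken from the Pitt Corpus.

```python
import time
from typing import Callable, Dict, Sequence, Tuple


def evaluate(
    classify: Callable[[str], str],
    samples: Sequence[Tuple[str, str]],
) -> Dict[str, float]:
    """Return accuracy and latency statistics for `classify` on labeled text."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = 0
    latencies = []
    for text, label in samples:
        start = time.perf_counter()
        prediction = classify(text)
        latencies.append(time.perf_counter() - start)  # seconds per transcript
        correct += int(prediction == label)
    n = len(samples)
    return {
        "accuracy": correct / n,
        "mean_latency_s": sum(latencies) / n,
        "max_latency_s": max(latencies),
    }


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for an SLM: flags transcripts with 2+ 'uh' fillers."""
    return "dementia" if text.split().count("uh") >= 2 else "control"


# Invented toy transcripts (assumption: NOT Pitt Corpus data).
TOY_SAMPLES = [
    ("the boy is uh uh taking the cookie", "dementia"),
    ("the mother is washing dishes", "control"),
    ("water is uh overflowing", "dementia"),
    ("the girl reaches for a cookie", "control"),
]

report = evaluate(toy_classifier, TOY_SAMPLES)
print(report)
```

To evaluate an actual model, `toy_classifier` would be replaced by a function that wraps the locally hosted SLM and returns one of the two labels; the timing loop then captures end-to-end inference time per transcript.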

Abstract

Large language models (LLMs) have shown promise in biomedical and clinical applications, including aging research, patient care, and translational geriatrics. However, their use is limited by costs, privacy risks, and HIPAA concerns. These challenges point to an alternative: small language models (SLMs; <2 billion parameters) that run entirely on local hardware, reducing latency and dependence on external services. Recent open-source releases (e.g., DeepSeek R1) highlight the growing capability and efficiency of open-source models. These advances draw attention to SLMs, which are inherently efficient and easier to deploy in local clinical settings, and underscore the need to evaluate smaller, open-source language models on similar tasks. In this study, we evaluated four SLMs (DeepSeek Coder, GPT-2, Microsoft Phi-3, SmolLM2) on 398 transcripts from the Pitt Corpus, comprising dementia…

Linked entities

Diseases (1): dementia

Taxonomy

Topics: Mental Health via Writing · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education