Small Language Models for Early Detection of ADRD: Open-Source Language Models in Clinical Speech Classification
Venkatanand ram Addepalli, Praveen Rao, Erich Kummerfeld, Andrew Kiselica, Knoo Lee

TL;DR
This study shows that small language models can detect dementia from speech transcripts, with GPT-2 reaching 78.3% accuracy, while preserving privacy and reducing costs.
Contribution
First on-premise deployment of SLMs for dementia classification using linguistic features in a HIPAA-compliant setting.
Findings
GPT-2 achieved 78.3% accuracy in classifying dementia from speech transcripts.
SLMs delivered inference times under three seconds.
SLMs offer a privacy-preserving alternative to large language models for clinical use.
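The two headline metrics above (classification accuracy and per-transcript inference time) can be measured with a small harness like the sketch below. This is only an illustration under stated assumptions: `evaluate`, `stub_classifier`, and the four toy transcripts are hypothetical and are not the paper's models, method, or data. In practice the stub would be replaced by a locally hosted SLM.

```python
import time
from typing import Callable, List, Tuple


def evaluate(
    classify: Callable[[str], str],
    samples: List[Tuple[str, str]],
) -> Tuple[float, float]:
    """Return (accuracy, worst-case latency in seconds) over labeled samples."""
    correct = 0
    worst = 0.0
    for text, label in samples:
        start = time.perf_counter()
        prediction = classify(text)
        worst = max(worst, time.perf_counter() - start)
        correct += int(prediction == label)
    return correct / len(samples), worst


def stub_classifier(transcript: str) -> str:
    # Hypothetical placeholder, NOT the paper's approach: flags a transcript
    # when it contains two or more filler words. A real setup would call a
    # locally hosted small language model here.
    words = transcript.lower().split()
    fillers = words.count("uh") + words.count("um")
    return "dementia" if fillers >= 2 else "control"


# Invented toy examples for illustration only (not Pitt Corpus data).
samples = [
    ("uh the boy is um taking the uh cookie", "dementia"),
    ("the boy is on the stool reaching for the cookie jar", "control"),
    ("um the water is overflowing in the sink", "control"),
    ("the uh the mother uh is drying dishes", "control"),
]

accuracy, worst_latency = evaluate(stub_classifier, samples)
print(f"accuracy={accuracy:.3f}, worst latency={worst_latency:.6f}s")
```

Reporting the worst-case latency rather than the mean makes a claim such as "under three seconds" hold for every transcript, not only on average.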
Abstract
Large language models (LLMs) have shown promise in biomedical and clinical applications, including aging research, patient care, and translational geriatrics. However, their use is limited by costs, privacy risks, and HIPAA concerns. These challenges point to an alternative: small language models (SLMs; fewer than 2 billion parameters) that run entirely on local hardware, reducing latency and dependence on external services. Recent open-source releases (e.g., DeepSeek R1) highlight the growing capability and efficiency of open-source models. These advances draw attention to SLMs, which are inherently efficient and easier to deploy in local clinical settings, and underscore the need to evaluate smaller, open-source language models on similar tasks. In this study, we evaluated four SLMs (DeepSeek Coder, GPT-2, Microsoft Phi-3, SmolLM2) on 398 transcripts from the Pitt Corpus, comprising dementia…
Taxonomy
Topics: Mental Health via Writing · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
