AI‐Powered Speech Analysis: Automating Transcription, Embeddings, and Deep Learning for Early Alzheimer's Detection
Daniel Zhou, Zhen Qian

TL;DR
This paper presents an AI pipeline using speech analysis to detect Alzheimer's disease, showing that embedding models improve classification accuracy.
Contribution
The novel contribution is combining LLM embeddings with speech data to enhance Alzheimer's detection accuracy using machine learning models.
Findings
Alzheimer's patients show higher filler word frequency and lower vocabulary diversity in speech.
LLM embeddings significantly improve classification model performance compared to linguistic features alone.
SVM outperformed LR and RF, reaching 84% accuracy in Alzheimer's classification.
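The two linguistic markers named in the findings, filler word frequency and vocabulary diversity, can be illustrated with a minimal sketch. This is not the authors' code: the filler inventory `FILLERS`, the regex tokenizer, and the use of type-token ratio as the diversity measure are all assumptions made for illustration.

```python
import re

# Hypothetical filler inventory; the paper's actual list is not given in this summary.
FILLERS = {"uh", "um", "er", "ah", "hmm"}


def tokenize(transcript):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def filler_rate(tokens):
    """Fraction of tokens that are filler words (0.0 for an empty transcript)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in FILLERS) / len(tokens)


def type_token_ratio(tokens):
    """Distinct words divided by total words; an assumed proxy for vocabulary diversity."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


tokens = tokenize("Um, the boy is uh taking the cookie")
print(filler_rate(tokens), type_token_ratio(tokens))
```

Under these assumptions, a higher filler rate and a lower type-token ratio would point in the direction the findings report for Alzheimer's patients. Type-token ratio is sensitive to transcript length, so comparisons are only meaningful across transcripts of similar size.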
Abstract
Spontaneous speech is a promising, non‐invasive, cost‐effective biomarker. LLM vector embeddings capture semantic and contextual patterns. This study transcribed audio, generated embeddings, and trained machine learning models to classify AD patients versus healthy controls. We used audio files from the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) 2020 Challenge dataset, with 108 training participants (54 AD, 54 healthy) and 48 testing participants, all describing the Cookie Theft picture. We performed linguistic analysis of word frequency and speech disfluency. Next, we evaluated two commercial audio‐to‐text APIs (OpenAI Whisper and AssemblyAI) to build a more automated and scalable classification pipeline. We used two of OpenAI's new embedding models to generate embedding vectors. Lastly, we built three classification models: Support Vector Machine (SVC), Logistic…
Taxonomy
Topics: Voice and Speech Disorders · Dementia and Cognitive Impairment Research · Mental Health via Writing
