Linguistic Indicators of Early Cognitive Decline in the DementiaBank Pitt Corpus: A Statistical and Machine Learning Study
Artsvik Avetisyan, Sachin Kumar

TL;DR
This study demonstrates that linguistic features derived from spontaneous speech can reliably indicate early cognitive decline, using interpretable machine learning models validated by statistical tests on the DementiaBank Pitt Corpus.
Contribution
The study contributes a comprehensive analysis of linguistic markers using interpretable models, emphasizing the robustness of syntactic and grammatical features for early dementia detection.
Findings
Syntactic and grammatical features remain discriminative without lexical content.
Subject-level evaluation provides consistent results across models.
Significant group differences were found in word usage, sentence structure, and discourse coherence.
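According to the abstract, group differences of this kind are checked with Mann-Whitney U tests and Cliff's delta effect sizes. The sketch below shows how the two quantities relate for one feature; it is not the authors' code. The function name `cliffs_delta` and the feature values are hypothetical and made up purely for illustration, and the p-value computation is left out (a statistics library would normally supply it).

```python
def cliffs_delta(xs, ys):
    """Return (U, delta) for two independent samples.

    U is the Mann-Whitney U statistic for xs (ties count as 0.5).
    delta is Cliff's delta: P(x > y) - P(x < y), in [-1, 1].
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    n_pairs = len(xs) * len(ys)
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    ties = n_pairs - greater - less
    u = greater + 0.5 * ties
    delta = (greater - less) / n_pairs
    return u, delta


# Hypothetical per-transcript values of one linguistic feature
# (illustrative numbers only, not taken from the corpus).
group_a = [0.21, 0.25, 0.19, 0.30, 0.27]
group_b = [0.15, 0.18, 0.22, 0.12, 0.16]

u, delta = cliffs_delta(group_a, group_b)
print(f"U = {u}, Cliff's delta = {delta:.2f}")
```

The two statistics carry the same information: delta equals 2U / (n1 * n2) - 1, so the effect size can be read directly off the U statistic once the sample sizes are known.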
Abstract
Background: Subtle changes in spontaneous language production are among the earliest indicators of cognitive decline. Identifying linguistically interpretable markers of dementia can support transparent and clinically grounded screening approaches. Methods: This study analyzes spontaneous speech transcripts from the DementiaBank Pitt Corpus using three linguistic representations: raw cleaned text, a part-of-speech (POS)-enhanced representation combining lexical and grammatical information, and a POS-only syntactic representation. Logistic regression and random forest models were evaluated under two protocols: transcript-level train-test splits and subject-level five-fold cross-validation to prevent speaker overlap. Model interpretability was examined using global feature importance, and statistical validation was conducted using Mann-Whitney U tests with Cliff's delta effect sizes.…
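The subject-level protocol described in the Methods can be sketched as a grouped k-fold split, in which every transcript from a given speaker lands on the same side of each split. This is a minimal illustration under assumptions, not the study's implementation: the helper `subject_level_folds` and the subject IDs are hypothetical, and subjects are assigned to folds round-robin in sorted order rather than by any scheme the paper specifies. Libraries such as scikit-learn provide an equivalent utility (`GroupKFold`).

```python
from collections import defaultdict


def subject_level_folds(subject_ids, k=5):
    """Split transcript indices into k folds without speaker overlap.

    subject_ids[i] is the speaker of transcript i. Returns a list of
    (train_indices, test_indices) pairs, one per fold.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    subjects = sorted(by_subject)
    if len(subjects) < k:
        raise ValueError("need at least k distinct subjects")

    folds = []
    for fold in range(k):
        # Every k-th subject (starting at `fold`) forms the test set.
        test_subjects = set(subjects[fold::k])
        test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical speakers: several contribute more than one transcript.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s6", "s6", "s7"]
folds = subject_level_folds(subject_ids, k=5)

for train_idx, test_idx in folds:
    train_speakers = {subject_ids[i] for i in train_idx}
    test_speakers = {subject_ids[i] for i in test_idx}
    # No speaker appears on both sides of a split.
    assert train_speakers.isdisjoint(test_speakers)
```

A transcript-level split, by contrast, can place two transcripts from the same speaker in both training and test sets, which lets a model score well by recognizing the speaker rather than the condition.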
Taxonomy
Topics: Dementia and Cognitive Impairment Research · Neurobiology of Language and Bilingualism · Mental Health via Writing
