VoxCog: Towards End-to-End Multilingual Cognitive Impairment Classification through Dialectal Knowledge
Tiantian Feng, Anfeng Xu, Jinkook Lee, Shrikanth Narayanan

TL;DR
This paper introduces VoxCog, an end-to-end multilingual speech model that uses pre-trained dialect recognition models to improve classification of cognitive impairments such as Alzheimer's Disease (AD) and mild cognitive impairment (MCI), outperforming previous multimodal approaches.
Contribution
VoxCog is presented as the first approach to integrate dialectal knowledge into speech models for cognitive impairment detection, improving accuracy without relying on additional modalities such as text or images.
Findings
Achieved 87.5% accuracy on the ADReSS 2020 dataset.
Outperformed existing multimodal and LLM-based methods.
Initializing the model from a dialect classifier consistently improved predictive performance across multiple multilingual AD and MCI datasets.
Abstract
In this work, we present a novel perspective on cognitive impairment classification from speech by integrating speech foundation models that explicitly recognize speech dialects. Our motivation is based on the observation that individuals with Alzheimer's Disease (AD) or mild cognitive impairment (MCI) often produce measurable speech characteristics, such as slower articulation rate and lengthened sounds, in a manner similar to dialectal phonetic variations seen in speech. Building on this idea, we introduce VoxCog, an end-to-end framework that uses pre-trained dialect models to detect AD or MCI without relying on additional modalities such as text or images. Through experiments on multiple multilingual datasets for AD and MCI detection, we demonstrate that model initialization with a dialect classifier on top of speech foundation models consistently improves the predictive performance…
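The initialization idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the checkpoint layout, the parameter names (`encoder.weight`, `head.weight`), the tiny dimensions, and the choice to replace the dialect head with a freshly initialized two-class head are all assumptions made for illustration. Plain Python lists stand in for the tensors of a real speech foundation model.

```python
import random


def make_dialect_checkpoint(feature_dim=4, num_dialects=3, seed=0):
    """Hypothetical stand-in for a pre-trained dialect classifier checkpoint."""
    rng = random.Random(seed)
    return {
        # Shared speech encoder (here a single square weight matrix).
        "encoder.weight": [[rng.uniform(-1, 1) for _ in range(feature_dim)]
                           for _ in range(feature_dim)],
        # Dialect classification head: one row of weights per dialect.
        "head.weight": [[rng.uniform(-1, 1) for _ in range(feature_dim)]
                        for _ in range(num_dialects)],
    }


def init_from_dialect_checkpoint(checkpoint, num_classes=2, seed=1):
    """Copy the encoder weights and attach a new head for the target task.

    Assumption: the dialect head is discarded and a small random head with
    `num_classes` outputs (e.g. impaired vs. control) takes its place.
    """
    rng = random.Random(seed)
    feature_dim = len(checkpoint["encoder.weight"][0])
    params = {}
    for name, value in checkpoint.items():
        if name.startswith("encoder."):
            # Copy row by row so later fine-tuning cannot alter the checkpoint.
            params[name] = [row[:] for row in value]
    params["head.weight"] = [[rng.gauss(0.0, 0.02) for _ in range(feature_dim)]
                             for _ in range(num_classes)]
    return params


def predict_logits(params, features):
    """Forward pass: linear encoder with ReLU, then the classification head."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)))
              for row in params["encoder.weight"]]
    return [sum(w * h for w, h in zip(row, hidden))
            for row in params["head.weight"]]


dialect_ckpt = make_dialect_checkpoint()
cog_params = init_from_dialect_checkpoint(dialect_ckpt, num_classes=2)
logits = predict_logits(cog_params, [0.1, -0.2, 0.3, 0.4])
```

In this sketch the encoder starts from the dialect model's weights rather than from a generic initialization, and only the head is new. In a real setup the resulting parameters would then be fine-tuned on AD or MCI labels; how much of the encoder is updated during fine-tuning is not specified in the text above.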
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Neurobiology of Language and Bilingualism
