Multilingual Cognitive Impairment Detection in the Era of Foundation Models

Damar Hoogland; Boshko Koloski; Jaya Caporusso; Tine Kolenik; Ana Zwitter Vitez; Senja Pollak; Christina Manouilidou; Matthew Purver

arXiv:2604.06758 · cs.CL · April 9, 2026

TL;DR

This study compares zero-shot large language models and supervised tabular models for multilingual cognitive impairment detection from speech transcripts, highlighting the effectiveness of structured linguistic features and fusion methods in small-data scenarios.

Contribution

The paper evaluates CI detection from speech transcripts in English, Slovene, and Korean, comparing zero-shot LLMs under three input settings (transcript-only, linguistic-features-only, and combined) with supervised tabular models trained under a leave-one-out protocol.

Findings

01

Supervised tabular models generally outperform zero-shot LLMs, which still provide competitive no-training baselines.

02

Supervised models perform best when engineered linguistic features are included and combined with transcript embeddings.

03

In few-shot experiments on embeddings, the value of limited supervision varies by language.

Abstract

We evaluate cognitive impairment (CI) classification from transcripts of speech in English, Slovene, and Korean. We compare zero-shot large language models (LLMs) used as direct classifiers under three input settings -- transcript-only, linguistic-features-only, and combined -- with supervised tabular approaches trained under a leave-one-out protocol. The tabular models operate on engineered linguistic features, transcript embeddings, and early or late fusion of both modalities. Across languages, zero-shot LLMs provide competitive no-training baselines, but supervised tabular models generally perform better, particularly when engineered linguistic features are included and combined with embeddings. Few-shot experiments focusing on embeddings indicate that the value of limited supervision is language-dependent, with some languages benefiting substantially from additional labelled…
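
The supervised setup described in the abstract (leave-one-out evaluation, with early or late fusion of linguistic features and transcript embeddings) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the data is synthetic, the nearest-centroid scorer is a stand-in for whatever tabular models the authors used, and the names `features`, `embeddings`, `early_fusion`, and `late_fusion` are hypothetical. Early fusion here means concatenating the two vectors before classification; late fusion means averaging the per-modality probabilities.

```python
import math
import random

rng = random.Random(0)

def make_rows(center, dim, n):
    # Synthetic vectors: `center` plus small uniform noise in each dimension.
    return [[center + rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n)]

# Hypothetical toy data: 4 control (label 0) and 4 impaired (label 1) speakers.
labels = [0] * 4 + [1] * 4
features = make_rows(0.0, 2, 4) + make_rows(3.0, 2, 4)    # stand-in for linguistic features
embeddings = make_rows(0.0, 4, 4) + make_rows(3.0, 4, 4)  # stand-in for transcript embeddings

def centroid(rows):
    # Column-wise mean of a list of equal-length vectors.
    return [sum(col) / len(rows) for col in zip(*rows)]

def prob_impaired(x, train_X, train_y):
    # Nearest-centroid scorer (an assumed stand-in classifier):
    # the margin is positive when x is closer to the impaired-class centroid.
    pos = centroid([r for r, y in zip(train_X, train_y) if y == 1])
    neg = centroid([r for r, y in zip(train_X, train_y) if y == 0])
    margin = math.dist(x, neg) - math.dist(x, pos)
    return 1.0 / (1.0 + math.exp(-margin))

def loo_accuracy(predict):
    # Leave-one-out: hold out each speaker once, fit on all the others.
    correct = 0
    for i in range(len(labels)):
        train_idx = [j for j in range(len(labels)) if j != i]
        predicted = int(predict(i, train_idx) >= 0.5)
        correct += int(predicted == labels[i])
    return correct / len(labels)

def early_fusion(i, train_idx):
    # Early fusion: concatenate both modalities into one vector, then classify.
    fused = [f + e for f, e in zip(features, embeddings)]
    train_y = [labels[j] for j in train_idx]
    return prob_impaired(fused[i], [fused[j] for j in train_idx], train_y)

def late_fusion(i, train_idx):
    # Late fusion: classify each modality separately, then average the probabilities.
    train_y = [labels[j] for j in train_idx]
    p_feat = prob_impaired(features[i], [features[j] for j in train_idx], train_y)
    p_emb = prob_impaired(embeddings[i], [embeddings[j] for j in train_idx], train_y)
    return (p_feat + p_emb) / 2

early_acc = loo_accuracy(early_fusion)
late_acc = loo_accuracy(late_fusion)
print(f"early fusion LOO accuracy: {early_acc:.2f}")
print(f"late fusion LOO accuracy: {late_acc:.2f}")
```

The toy classes are deliberately well separated, so the printed accuracies say nothing about the paper's results; the sketch only shows where the held-out speaker and the two fusion points sit in the evaluation loop.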
