CogBench: A Large Language Model Benchmark for Multilingual Speech-Based Cognitive Impairment Assessment

Rui Feng, Zhiyao Luo, Wei Wang, Yuting Song, Yong Liu, Tingting Zhu, Jianqing Li, and Xingyao Wang

arXiv:2508.03360·cs.AI·October 20, 2025

TL;DR

This paper introduces CogBench, a benchmark for evaluating large language models in multilingual speech-based cognitive impairment assessment, highlighting the importance of model adaptability across languages and clinical settings.

Contribution

The paper presents CogBench as the first benchmark for cross-lingual and cross-site evaluation of LLMs in speech-based cognitive impairment assessment, including a newly collected test set (CIR-E) and a unified multimodal evaluation pipeline.

Findings

01

LLMs with chain-of-thought prompting adapt better across languages and sites than conventional deep learning models.

02

Lightweight fine-tuning with LoRA enhances cross-domain generalization.

03

Conventional deep learning models degrade substantially when transferred across domains.

Abstract

Automatic assessment of cognitive impairment from spontaneous speech offers a promising, non-invasive avenue for early cognitive screening. However, current approaches often lack generalizability when deployed across different languages and clinical settings, limiting their practical utility. In this study, we propose CogBench, the first benchmark designed to evaluate the cross-lingual and cross-site generalizability of large language models (LLMs) for speech-based cognitive impairment assessment. Using a unified multimodal pipeline, we evaluate model performance on three speech datasets spanning English and Mandarin: ADReSSo, NCMMSC2021-AD, and a newly collected test set, CIR-E. Our results show that conventional deep learning models degrade substantially when transferred across domains. In contrast, LLMs equipped with chain-of-thought prompting demonstrate better adaptability, though…
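To make the idea of cross-domain evaluation concrete, the following is a minimal sketch, not the paper's pipeline. It assumes each domain is a set of numeric feature vectors with binary labels, and it swaps in a simple nearest-centroid classifier for the models CogBench evaluates. All function names, domain names, and data values are hypothetical and purely illustrative. A model is fitted on each source domain and scored on every target domain, which yields a source-by-target accuracy matrix.

```python
from statistics import mean


def fit_centroids(features, labels):
    """Fit a nearest-centroid 'model': one mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, feature):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feature, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, features, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == l for f, l in zip(features, labels))
    return hits / len(labels)


def cross_domain_matrix(domains):
    """Train on each source domain and evaluate on every target domain.

    domains: dict mapping a domain name to (features, labels).
    Returns a dict mapping (source, target) to accuracy.
    Note: the diagonal here is training-set accuracy, which is optimistic;
    a real protocol would use held-out splits within each domain.
    """
    results = {}
    for source, (train_x, train_y) in domains.items():
        model = fit_centroids(train_x, train_y)
        for target, (test_x, test_y) in domains.items():
            results[(source, target)] = accuracy(model, test_x, test_y)
    return results


# Hypothetical toy data: two domains with the same class structure
# but shifted feature distributions (a crude stand-in for domain shift).
toy_domains = {
    "domain_a": ([[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]], [0, 0, 1, 1]),
    "domain_b": ([[2.0, 2.1], [2.2, 2.0], [3.0, 2.9], [2.9, 3.1]], [0, 0, 1, 1]),
}

matrix = cross_domain_matrix(toy_domains)
for (source, target), acc in sorted(matrix.items()):
    print(f"train={source} test={target} accuracy={acc:.2f}")
```

On this toy data the in-domain entries are perfect while the cross-domain entries fall to chance level, because the shifted features all land nearest a single centroid. This illustrates only the shape of the evaluation and says nothing about the results reported in the paper.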
