Cross-Linguistic Persona-Driven Data Synthesis for Robust Multimodal Cognitive Decline Detection
Rui Feng, Zhiyao Luo, Liuyu Wu, Wei Wang, Yuting Song, Yong Liu, Kok Pin Ng, Jianqing Li, and Xingyao Wang

TL;DR
This paper introduces SynCog, a framework that synthesizes diverse multilingual clinical data and fine-tunes multimodal models with Chain-of-Thought reasoning, significantly improving early cognitive decline detection across languages.
Contribution
SynCog is a novel approach combining controllable zero-shot data synthesis with Chain-of-Thought fine-tuning to enhance cross-lingual robustness and interpretability in cognitive decline detection.
Findings
Achieved Macro-F1 scores of 80.67% and 78.46% on the ADReSS and ADReSSo benchmarks, respectively.
Demonstrated robust cross-linguistic generalization with 48.71% Macro-F1 on a Mandarin cohort.
Effectively alleviated data scarcity through synthetic data augmentation.
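For readers unfamiliar with the metric reported above, Macro-F1 is the unweighted mean of the per-class F1 scores, so each diagnostic class counts equally regardless of how many subjects it contains. The following is a minimal sketch of that computation; the label names ("AD", "HC") and the toy predictions are illustrative assumptions, not data or code from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (not from the paper): 4 "AD" and 2 "HC" subjects.
y_true = ["AD", "AD", "AD", "AD", "HC", "HC"]
y_pred = ["AD", "AD", "AD", "HC", "HC", "AD"]
score = macro_f1(y_true, y_pred)
print(f"Macro-F1: {score:.3f}")
```

In this toy case the "AD" class scores 0.75 and the "HC" class scores 0.5, so the macro average is 0.625 even though 4 of 6 predictions are correct. This is why Macro-F1 is commonly preferred over accuracy when class sizes differ.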
Abstract
Speech-based digital biomarkers represent a scalable, non-invasive frontier for the early identification of Mild Cognitive Impairment (MCI). However, the development of robust diagnostic models remains impeded by acute clinical data scarcity and a lack of interpretable reasoning. Current solutions frequently struggle with cross-lingual generalization and fail to provide the transparent rationales essential for clinical trust. To address these barriers, we introduce SynCog, a novel framework integrating controllable zero-shot multimodal data synthesis with Chain-of-Thought (CoT) deduction fine-tuning. Specifically, SynCog simulates diverse virtual subjects with varying cognitive profiles to effectively alleviate clinical data scarcity. This generative paradigm enables the rapid, zero-shot expansion of clinical corpora across diverse languages, effectively bypassing data bottlenecks in…
Taxonomy
Topics: Machine Learning in Healthcare · Genomics and Rare Diseases · Artificial Intelligence in Healthcare and Education
