# Protein-based Diagnosis and Analysis of Co-pathologies Across Neurodegenerative Diseases: Large-Scale AI-Boosted CSF and Plasma Classification

**Authors:** Ying Xu, Daniel Western, Gyujin Heo, Kwangsik Nho, Yen-Ning Huang, Shiwei Liu, Hamilton Se-Hwee Oh, Yike Chen, Jigyasha Timsina, Menghan Liu, Yinxu Tang, Katherine Gong, John Budde, Varsha Krish, Farhad Imam, Raquel Puerta Fuentes, Amanda Cano, Marta Marquie, Merce Boada, Pau Pastor, Agustin Ruiz, Maria Victoria Fernández, David Bennett, Gregory Klein, Tony Wyss-Coray, Andrew J Saykin, Muhammad Ali, Carlos Cruchaga

PMC · DOI: 10.21203/rs.3.rs-6933762/v1 · 2025-07-31

## TL;DR

This paper introduces an explainable, boosting-based AI framework that uses proteomic data from cerebrospinal fluid (CSF) and plasma to diagnose neurodegenerative diseases and to quantify overlap between them within an individual.

## Contribution

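The novel contribution is a multi-disease AI diagnostic framework developed and validated on proteomic data from more than 21,000 CSF and plasma samples, which generalized to neuropathological and clinical validation data without retraining.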

## Key findings

- AI models achieved high diagnostic accuracy in the testing datasets (weighted AUCs of 0.97 for CSF and 0.88 for plasma), equivalent to traditional biomarkers.
- Using zero-shot learning, the framework classifies disease subtypes (including autosomal dominant AD and prodromal PD) and clarifies disease states for individuals with conflicting clinical information.
- The model can prioritize individuals at risk of neurodegenerative diseases even when they are cognitively normal.
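
The abstract describes the reported AUCs as weighted. One common way to compute a weighted multi-class AUC is to take a one-vs-rest AUC per disease class and average them by class prevalence. The sketch below illustrates that convention in plain Python; it is an assumption about the metric, not the authors' evaluation code, and the function names (`binary_auc`, `weighted_ovr_auc`) and the toy labels and scores are hypothetical.

```python
from collections import Counter


def binary_auc(scores, labels):
    """AUC from the rank-sum (Mann-Whitney U) formulation.

    `labels` are 0/1; tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative samples")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def weighted_ovr_auc(y_true, proba, classes):
    """Prevalence-weighted average of one-vs-rest AUCs.

    `proba[i][k]` is the predicted probability of `classes[k]` for sample i.
    """
    counts = Counter(y_true)
    total = len(y_true)
    auc = 0.0
    for k, cls in enumerate(classes):
        labels = [1 if y == cls else 0 for y in y_true]
        scores = [row[k] for row in proba]
        auc += counts[cls] / total * binary_auc(scores, labels)
    return auc


# Toy example (made-up values, not data from the paper).
classes = ["AD", "PD"]
y_true = ["AD", "AD", "AD", "PD"]
proba = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
toy_auc = weighted_ovr_auc(y_true, proba, classes)
```

Weighting by prevalence means that frequent classes dominate the summary number, so per-class AUCs are worth inspecting alongside it when the classes are imbalanced.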

## Abstract

Neurodegenerative diseases (including Alzheimer’s disease, Parkinson’s disease, Frontotemporal dementia, and Dementia with Lewy bodies) pose diagnostic challenges due to overlapping pathology and clinical heterogeneity. We leveraged proteomic data from more than 21,000 cerebrospinal fluid and plasma samples to develop and validate explainable, boosting-based multi-disease AI classifiers. The models achieved weighted AUCs in the testing datasets of 0.97 for CSF and 0.88 for plasma, equivalent to traditional biomarkers. The model was validated with neuropathological and clinical data, confirming robust generalizability without any retraining. Using zero-shot learning, we classified disease subtypes including autosomal dominant AD and prodromal PD and clarified disease states for those with conflicting clinical information. The model also showed the ability to prioritize cognitively normal individuals at disease risk. This framework enabled the identification and quantification of continuous, individual-level disease probabilities that allow for the quantification of overlap across diseases and co-pathologies within an individual. Through this work, we establish a benchmark computational framework for enhancing diagnostic precision in NDs.
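
To make the idea of continuous, individual-level disease probabilities concrete, the following sketch shows one way such a profile could be read to flag possible co-pathology. It assumes each disease receives its own probability in [0, 1] (so the values need not sum to 1) and uses an arbitrary 0.5 cutoff; the abstract does not specify either detail, and `summarize_profile`, the threshold, and the example values are hypothetical.

```python
def summarize_profile(probs, threshold=0.5):
    """Summarize one individual's per-disease probabilities.

    `probs` maps a disease label to a probability in [0, 1].
    Returns the top-ranked disease, every disease at or above
    `threshold` (highest first), and a co-pathology flag that is
    True when two or more diseases pass the threshold.
    """
    for disease, p in probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {disease}: {p}")

    # Rank by descending probability; break ties alphabetically.
    ranked = sorted(probs, key=lambda d: (-probs[d], d))
    flagged = [d for d in ranked if probs[d] >= threshold]
    return {
        "top": ranked[0] if ranked else None,
        "flagged": flagged,
        "co_pathology": len(flagged) >= 2,
    }


# Made-up profile for illustration only.
profile = {"AD": 0.82, "PD": 0.10, "FTD": 0.05, "DLB": 0.64}
summary = summarize_profile(profile)
```

A hard cutoff is only a convenience for illustration; keeping the raw probabilities preserves the continuous overlap that the abstract emphasizes.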

## Linked entities

- **Diseases:** Alzheimer’s disease (MONDO:0004975), Parkinson’s disease (MONDO:0005180), Frontotemporal dementia (MONDO:0010857), Dementia with Lewy bodies (MONDO:0007488)

## Full-text entities

- **Diseases:** Neurodegenerative Diseases (MESH:D019636), Dementia with Lewy bodies (MESH:D020961), Frontotemporal dementia (MESH:D057180), AD (MESH:D000544), PD (MESH:D010300)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12324596/full.md

---
Source: https://tomesphere.com/paper/PMC12324596