# Identifying and Predicting Parkinson's Disease Subtypes through Trajectory Clustering via Bipartite Networks

**Authors:** Sanjukta Krishnagopal, Rainer von Coelln, Lisa M. Shulman, Michelle Girvan

arXiv: 1906.05338 · 2020-07-01

## TL;DR

This paper introduces Trajectory Profile Clustering (TPC), a network-based algorithm that identifies Parkinson's disease subtypes from longitudinal data and predicts an individual patient's subtype from baseline assessments, reaching 74% accuracy for year-5 subtype prediction in a test/validation cohort.

## Contribution

The study presents a data-driven, network-based method for subtype identification and early prediction in Parkinson's disease. It clusters patients by how their variables progress over time, not only by the variable values themselves, and it can also incorporate genetic data.

## Key findings

- Identified 3 PD subtypes with distinct progression profiles
- Achieved 74% accuracy in predicting patient subtypes at year 5 from baseline assessments in a test/validation cohort
- Demonstrated seamless integration of genetic variability into the model

## Abstract

Parkinson's disease (PD) is a common neurodegenerative disease with a high degree of heterogeneity in its clinical features, rate of progression, and change of variables over time. In this work, we present a novel data-driven, network-based Trajectory Profile Clustering (TPC) algorithm for 1) identification of PD subtypes and 2) early prediction of disease progression in individual patients. Our subtype identification is based not only on PD variables, but also on their complex patterns of progression, providing a useful tool for the analysis of large, heterogeneous, longitudinal data. Specifically, we cluster patients based on the similarity of their trajectories through a time series of bipartite networks connecting patients to demographic, clinical, and genetic variables. We apply this approach to demographic and clinical data from the Parkinson's Progression Markers Initiative (PPMI) dataset and identify 3 patient clusters, consistent with 3 distinct PD subtypes, each with a characteristic variable progression profile. Additionally, TPC predicts an individual patient's subtype and future disease trajectory, based on baseline assessments. Application of our approach resulted in 74% accurate subtype prediction in year 5 in a test/validation cohort. Furthermore, we show that genetic variability can be integrated seamlessly in our TPC approach. In summary, using PD as a model for chronic progressive diseases, we show that TPC leverages high-dimensional longitudinal datasets for subtype identification and early prediction of individual disease subtype. We anticipate this approach will be broadly applicable to multidimensional longitudinal datasets in diverse chronic diseases.
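
The clustering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how patient-variable edges are defined or which clustering step is used, so the sketch assumes a simple per-variable threshold for edges, counts shared (time, variable) edges as the similarity between two patients, and uses connected components of the resulting patient-patient graph as a stand-in for the clustering step. The function names, the `min_shared` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def trajectory_profile(series, thresholds):
    """Return one patient's trajectory as a set of (time, variable) edges.

    series: list over time points, each a list of variable scores.
    Assumption of this sketch: a patient is linked to a variable at a
    time point when the score exceeds that variable's threshold.
    """
    return {
        (t, v)
        for t, row in enumerate(series)
        for v, value in enumerate(row)
        if value > thresholds[v]
    }


def shared_edges(profile_a, profile_b):
    """Similarity of two patients: number of (time, variable) edges in common."""
    return len(profile_a & profile_b)


def cluster_patients(data, thresholds, min_shared):
    """Group patients whose trajectories share at least min_shared edges.

    Connected components are used here as a simple stand-in for a
    proper community-detection step.
    """
    profiles = {pid: trajectory_profile(s, thresholds) for pid, s in data.items()}

    # Patient-patient graph: link two patients with similar trajectories.
    neighbors = {pid: set() for pid in profiles}
    for a, b in combinations(profiles, 2):
        if shared_edges(profiles[a], profiles[b]) >= min_shared:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Connected components via depth-first search.
    clusters, seen = [], set()
    for start in profiles:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Toy data (fabricated for illustration): 4 patients, 3 time points, 2 variables.
toy_data = {
    "p1": [[1, 0], [2, 0], [3, 0]],
    "p2": [[1, 0], [2, 1], [3, 0]],
    "p3": [[0, 1], [0, 2], [0, 3]],
    "p4": [[0, 2], [0, 2], [1, 3]],
}
thresholds = [1, 1]
clusters = cluster_patients(toy_data, thresholds, min_shared=2)
print(clusters)
```

In the toy data, `p1` and `p2` worsen on the first variable while `p3` and `p4` worsen on the second, so the two pairs fall into separate groups. Because each edge is indexed by time as well as by variable, two patients who reach the same scores at different time points are treated as less similar, which is the sense in which the clustering depends on progression patterns and not only on variable values.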

---
Source: https://tomesphere.com/paper/1906.05338