# Explainable multimodal feature fusion networks for Parkinson's disease prediction

**Authors:** Abishek Ravichandran, Tamilarasi Kathirvel Murugan, Logeswari Govindaraj, Vishal M

PMC · DOI: 10.3389/fdgth.2026.1771281 · 2026-02-27

## TL;DR

This paper introduces an explainable multimodal framework that fuses handwriting, gait, and speech features for Parkinson's disease detection, reporting 92% accuracy with SHAP- and Grad-CAM-based interpretability.

## Contribution

A novel explainable multimodal deep learning framework for Parkinson's disease prediction that combines early feature fusion of handwriting, gait, and speech features with an XGBoost classifier, and uses SHAP and Grad-CAM for clinical transparency.

## Key findings

- The trimodal fusion model achieves 92% accuracy, outperforming the unimodal handwriting (91%), gait (90%), and speech (74%) models.
- Key contributors to PD prediction include handwriting tremors, gait asymmetries, and speech instabilities.
- The fusion model attains a macro F1-score of 0.89, an AUC of 0.95, and an average precision (AP) of 0.96, with a reported sensitivity of 90% and specificity of 89%.
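
Besides accuracy, AUC, and AP, the abstract reports sensitivity, specificity, and a macro F1-score. As a rough illustration of how such threshold-based figures are derived from a binary confusion matrix, here is a minimal sketch. The function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the paper.

```python
from typing import Dict


def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute common metrics from binary confusion-matrix counts.

    tp/fn: PD cases predicted as PD / as healthy.
    fp/tn: healthy controls predicted as PD / as healthy.
    """
    sensitivity = _safe_div(tp, tp + fn)   # recall of the PD class
    specificity = _safe_div(tn, tn + fp)   # recall of the healthy class
    accuracy = _safe_div(tp + tn, tp + fn + fp + tn)

    # Per-class F1, then the unweighted (macro) average over both classes.
    precision_pos = _safe_div(tp, tp + fp)
    precision_neg = _safe_div(tn, tn + fn)
    f1_pos = _safe_div(2 * precision_pos * sensitivity, precision_pos + sensitivity)
    f1_neg = _safe_div(2 * precision_neg * specificity, precision_neg + specificity)
    macro_f1 = (f1_pos + f1_neg) / 2

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "macro_f1": macro_f1,
    }


# Hypothetical counts chosen only to exercise the function.
metrics = binary_metrics(tp=45, fn=5, fp=5, tn=45)
```

AUC and AP are not covered here: they are computed from ranked prediction scores rather than from a single confusion matrix.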

## Abstract

Parkinson's disease (PD) is a progressive neurodegenerative disorder characterized by motor and non-motor impairments, where early diagnosis remains challenging due to reliance on subjective clinical assessments. Recent artificial intelligence (AI)-based approaches have demonstrated promise in identifying subtle PD biomarkers from individual modalities such as speech, gait, and handwriting; however, unimodal systems often fail to capture the heterogeneity of the disease and provide limited interpretability. To address these limitations, this study proposes a multimodal deep learning framework that integrates handwriting, gait, and speech modalities using an early feature fusion strategy for robust and interpretable PD detection. Each modality is processed through a dedicated feature extraction pipeline using deep neural networks, followed by static feature concatenation and classification using an XGBoost model. Model transparency is enhanced using explainable AI (XAI) techniques, including SHapley Additive exPlanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM), enabling clinical interpretability of modality- and feature-level contributions. Experimental evaluation on benchmark datasets demonstrates that the proposed trimodal fusion model achieves an accuracy of 92%, outperforming unimodal handwriting (91%), gait (90%), and speech (74%) models. The fusion framework attains a macro F1-score of 0.89, an area under the ROC curve (AUC) of 0.95, and an average precision (AP) of 0.96, indicating strong discriminative capability and robustness. Confusion matrix analysis reveals balanced sensitivity (90%) and specificity (89%) across classes. Explainability analysis confirms that handwriting tremor patterns, gait force asymmetries, and speech spectral instabilities are key contributors to PD prediction. 
These results highlight the effectiveness of explainable multimodal AI in delivering accurate, reliable, and clinically interpretable solutions for early PD detection.
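
The early fusion step described above (per-modality feature extraction followed by static concatenation) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the feature values, vector lengths, attribution values, and the helper names `fuse_features` and `modality_importance` are all hypothetical. The sketch also records which index range of the fused vector belongs to each modality, which is one simple way to roll feature-level attributions (for example SHAP values) up into modality-level shares.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]


def fuse_features(modalities: Dict[str, List[float]]) -> Tuple[List[float], Dict[str, Span]]:
    """Concatenate per-modality feature vectors into one fused vector.

    Returns the fused vector and a map from modality name to the
    half-open (start, end) index range it occupies in that vector.
    """
    fused: List[float] = []
    spans: Dict[str, Span] = {}
    for name, features in modalities.items():
        start = len(fused)
        fused.extend(features)
        spans[name] = (start, len(fused))
    return fused, spans


def modality_importance(attributions: List[float], spans: Dict[str, Span]) -> Dict[str, float]:
    """Sum absolute per-feature attributions within each modality's span
    and normalise the totals so they add up to 1."""
    totals = {
        name: sum(abs(a) for a in attributions[start:end])
        for name, (start, end) in spans.items()
    }
    grand_total = sum(totals.values())
    return {
        name: (total / grand_total if grand_total else 0.0)
        for name, total in totals.items()
    }


# Hypothetical embeddings for one subject; real ones would come from the
# dedicated deep feature extractors for each modality.
sample = {
    "handwriting": [0.12, 0.80, 0.33],
    "gait": [0.45, 0.10],
    "speech": [0.05, 0.61, 0.27, 0.90],
}
fused, spans = fuse_features(sample)

# Placeholder attributions, one per fused feature (assumed values, standing
# in for what an explainer such as SHAP would produce for a trained model).
attributions = [0.2, -0.4, 0.1, 0.3, -0.1, 0.05, 0.0, -0.05, 0.1]
shares = modality_importance(attributions, spans)
```

In the pipeline the abstract describes, the fused vector would then be passed to the XGBoost classifier; that step is omitted here to keep the sketch free of external dependencies.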

## Linked entities

- **Diseases:** Parkinson's disease (MONDO:0005180)

## Full-text entities

- **Diseases:** neurodegenerative disorder (MESH:D019636), PD (MESH:D010300), motor and non-motor impairments (MESH:D000068079), tremor (MESH:D014202)

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12982457/full.md

---
Source: https://tomesphere.com/paper/PMC12982457