# GRU-Based Deep Multimodal Fusion of Speech and Head-IMU Signals in Mixed Reality for Parkinson’s Disease Detection

**Authors:** Daria Hemmerling, Milosz Dudek, Justyna Krzywdziak, Magda Żbik, Wojciech Szecowka, Mateusz Daniol, Marek Wodzinski, Monika Rudzinska-Bar, Magdalena Wojcik-Pedziwiatr

PMC · DOI: 10.3390/s26010269 · Sensors (Basel, Switzerland) · 2026-01-01

## TL;DR

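Combining speech with head-IMU signals recorded on a HoloLens 2 headset gives a small but repeatable gain in Parkinson’s disease detection over speech alone (pooled AUC ≈ 0.875 vs. ≈ 0.865), with the benefit concentrated in tasks that involve movement.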

## Contribution

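The paper introduces a GRU-based multimodal fusion method that integrates synchronously recorded speech and head-inertial signals for PD detection, and benchmarks it against single-modality models under 5-fold stratified cross-validation on 165 participants (72 PD, 93 HC).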

## Key findings

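- Voice data alone achieved a pooled AUC of ≈0.865 for PD detection.
- Inertial (head-IMU) data alone was near chance, with an AUC of ≈0.497.
- Gated early-fusion of speech and inertial signals gave the highest AUC (≈0.875); cross-attention fusion was comparable (≈0.873).
- Gains were task-dependent: speech-dominated tasks were already well captured by audio, while tasks that embed movement benefited from the inertial channel.
- Motion data therefore acted as a conditional improvement factor rather than a sole predictor.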
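
As a rough illustration of what gated early fusion can look like, the sketch below scales each frame of inertial features by a learned gate before concatenating it with the audio features. This summary does not describe the paper's actual gate, so the scalar sigmoid gate, the function name `gated_early_fusion`, and every weight value here are assumptions chosen for illustration only. In a GRU-based model, the fused sequence would then be passed to the recurrent layers, which are not shown.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_early_fusion(audio_frames, imu_frames, gate_weights, gate_bias=0.0):
    """Fuse time-aligned audio and IMU frames with a scalar gate on the IMU branch.

    For each time step t (hypothetical formulation, not taken from the paper):
        g_t     = sigmoid(w . [a_t; m_t] + b)
        fused_t = [a_t; g_t * m_t]
    """
    if len(audio_frames) != len(imu_frames):
        raise ValueError("streams must be time-aligned (same number of frames)")
    fused = []
    for a, m in zip(audio_frames, imu_frames):
        joint = list(a) + list(m)
        if len(joint) != len(gate_weights):
            raise ValueError("gate_weights must match the joint feature size")
        g = sigmoid(sum(w * x for w, x in zip(gate_weights, joint)) + gate_bias)
        # Audio passes through unchanged; the gate decides how much motion to admit.
        fused.append(list(a) + [g * v for v in m])
    return fused


# Toy input: two frames, two audio features and one IMU feature per frame.
audio = [[0.2, -0.1], [0.4, 0.3]]
imu = [[1.0], [0.5]]

# Zero weights and zero bias give a gate of exactly 0.5 on every frame.
half_open = gated_early_fusion(audio, imu, gate_weights=[0.0, 0.0, 0.0])

# A strongly negative bias closes the gate, so the model falls back to audio.
closed = gated_early_fusion(audio, imu, gate_weights=[0.0, 0.0, 0.0], gate_bias=-50.0)
```

A gate of this kind is one way to read the finding above: when motion carries little information, the gate can stay near zero and the model behaves like the audio-only baseline.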

## Abstract

Parkinson’s disease (PD) alters both speech and movement, yet most automated assessments still treat these signals separately. We examined whether combining voice with head motion improves discrimination between patients and healthy controls (HC). Synchronous measurements of acoustic and inertial signals were collected using a HoloLens 2 headset. Data were obtained from 165 participants (72 PD/93 HC), following a standardized mixed-reality (MR) protocol. We benchmarked single-modality models against fusion strategies under 5-fold stratified cross-validation. Voice alone was robust (pooled AUC ≈ 0.865), while the inertial channel alone was near chance (AUC ≈ 0.497). Fusion provided a modest but repeatable improvement: gated early-fusion achieved the highest AUC (≈0.875), and cross-attention fusion was comparable (≈0.873). Gains were task-dependent. While speech-dominated tasks were already well captured by audio, tasks that embed movement benefited from complementary inertial data. The proposed MR capture proved feasible within a single session and showed that motion acts as a conditional improvement factor rather than a sole predictor. The results outline a practical path to multimodal screening and monitoring for PD, preserving the reliability of acoustic biomarkers while integrating kinematic features when they matter.
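
The evaluation protocol in the abstract (5-fold stratified cross-validation with a pooled AUC) can be sketched with the standard library alone. The code below assumes that "pooled" means collecting the out-of-fold scores from all folds and computing a single AUC over them; the abstract does not spell this out. The helper names (`stratified_folds`, `roc_auc`, `pooled_auc`, `fit_predict`) are hypothetical, and the constant scorer is a placeholder, not a model from the paper. Only the cohort sizes (72 PD, 93 HC) come from the source.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Return k lists of indices with roughly the same class balance in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pooled_auc(labels, folds, fit_predict):
    """Gather out-of-fold scores from every fold, then compute one AUC over all."""
    out_of_fold = [None] * len(labels)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        scores = fit_predict(train, held_out)  # one score per held-out index
        for i, s in zip(held_out, scores):
            out_of_fold[i] = s
    return roc_auc(labels, out_of_fold)


# Cohort sizes from the abstract: 72 PD (label 1) and 93 HC (label 0).
labels = [1] * 72 + [0] * 93
folds = stratified_folds(labels, k=5, seed=0)


def constant_scorer(train_idx, test_idx):
    """Placeholder model that gives every participant the same score."""
    return [0.0 for _ in test_idx]


print(pooled_auc(labels, folds, constant_scorer))  # prints 0.5
```

An uninformative scorer lands at exactly 0.5, which is the reference point for reading the inertial-only result (≈0.497) as near chance.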

## Linked entities

- **Diseases:** Parkinson’s disease (MONDO:0005180)

## Full-text entities

- **Diseases:** PD (MESH:D010300)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12788298/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12788298/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/PMC12788298/full.md

---
Source: https://tomesphere.com/paper/PMC12788298