# Multimodal data for predictive medicine: algorithmic fusion of clinical data in anesthesiology and intensive care

**Authors:** Sebastian Daniel Boie, Niklas Giesa, Maria Sekutowicz, Rustam Zhumagambetov, Stefan Haufe, Elias Grünewald, Felix Balzer

PMC · DOI: 10.3389/fmed.2026.1746867 · Frontiers in Medicine · 2026-01-23

## TL;DR

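This perspective article discusses how tabular data, clinical text, and physiologic time series can be combined through early, intermediate, or late fusion to support machine learning outcome prediction and risk stratification in anesthesiology and intensive care.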

## Contribution

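The paper provides a structured analysis of multimodal fusion strategies (early, intermediate, and late) for clinical data in anesthesiology and intensive care, together with modality-specific challenges, preprocessing strategies, and modeling approaches.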

## Key findings

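- Early fusion aggregates generated features into a unified tabular representation; its simplicity makes it a common first baseline.
- Intermediate fusion uses modality-specific encoders with shared layers to learn cross-modal dependencies, yielding the most complex and powerful models.
- Late fusion combines the outputs of modality-optimized models, offering modularity and robustness to missing modalities, which is an advantage for real-time deployment where data arrive asynchronously.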
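
To make the late-fusion point concrete, the following is a minimal sketch, not the authors' method. It assumes each modality already has its own trained model that outputs a probability, and that a modality that has not yet arrived is represented as `None`. The function name `late_fusion`, the modality keys, and the weighted average in logit space are all illustrative assumptions; the paper does not prescribe a specific combination rule.

```python
import math
from typing import Dict, Optional


def _logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, clipped away from 0 and 1 to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def late_fusion(
    probs: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Combine per-modality probabilities, skipping missing modalities.

    probs   -- hypothetical per-modality model outputs; None means "not available"
    weights -- hypothetical per-modality weights (e.g. from validation performance)
    """
    available = {m: p for m, p in probs.items() if p is not None}
    if not available:
        raise ValueError("no modality available for fusion")

    total_weight = sum(weights[m] for m in available)
    if total_weight <= 0:
        raise ValueError("weights of available modalities must sum to > 0")

    # Weighted mean of log-odds over the modalities that are present.
    fused_logit = sum(weights[m] * _logit(p) for m, p in available.items()) / total_weight
    return _sigmoid(fused_logit)


if __name__ == "__main__":
    w = {"tabular": 1.0, "text": 1.0, "timeseries": 2.0}
    # The text model has not produced an output yet; fusion still works.
    print(late_fusion({"tabular": 0.30, "text": None, "timeseries": 0.60}, w))
```

Because the weights are renormalized over whichever modalities are present, a prediction can be issued as soon as any single model has produced an output and refined as further data arrive.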

## Abstract

Anesthesiology and intensive care medicine are among the most data-rich fields of medicine, where accurate and timely outcome prediction or risk stratification is important. During patient care, heterogeneous data streams, including structured electronic health records, free-text documentation, and high-frequency physiologic time series, are recorded. These streams provide fertile ground for machine learning (ML) models to make individualized risk predictions. Yet secondary use of routine data remains difficult due to heterogeneity, missingness, variable granularity, ambiguously defined outcomes, and poor representation of clinical concepts in routine data. Reproducibility and transparency are difficult to achieve with complex, hospital-specific data pipelines. New complexities arise when combining different data modalities. This perspective article discusses three common modalities (tabular data, clinical text, and time series) and outlines modality-specific challenges, data preprocessing strategies, and ML modeling approaches. We examine multimodal fusion strategies through the common taxonomy of early, intermediate, and late fusion. In early fusion, generated features are aggregated into a unified tabular representation, offering simplicity and often serving as a first baseline prediction model. Intermediate fusion uses modality-specific encoders with shared layers to learn cross-modal dependencies. This strategy yields the most complex and powerful models. Late, decision-level fusion combines outputs from modality-optimized models, providing modularity and robustness to missing modalities, which is an advantage for real-time deployment where data arrive asynchronously. The growth of multi-centric datasets and federated infrastructures may enable intermediate-fusion architectures and multimodal foundation models to better capture patient trajectories, supporting risk stratification and personalized therapy in perioperative and intensive care settings.
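
As an illustration of the early-fusion idea described in the abstract, here is a minimal sketch under stated assumptions rather than the pipeline used in the paper. Time series are reduced to summary statistics, a free-text note is reduced to keyword flags, and both are merged with the structured fields into one flat feature row. The helper names, the choice of statistics, and the keyword matching are hypothetical simplifications; real pipelines would typically use richer text representations.

```python
from statistics import mean
from typing import Dict, List, Union

Number = Union[int, float]


def summarize_series(name: str, values: List[Number]) -> Dict[str, Number]:
    """Collapse one physiologic time series into fixed-length summary features."""
    if not values:
        # Keep the columns but mark them as missing.
        nan = float("nan")
        return {f"{name}_mean": nan, f"{name}_min": nan,
                f"{name}_max": nan, f"{name}_last": nan}
    return {
        f"{name}_mean": mean(values),
        f"{name}_min": min(values),
        f"{name}_max": max(values),
        f"{name}_last": values[-1],
    }


def text_flags(note: str, keywords: List[str]) -> Dict[str, int]:
    """Binary indicator per keyword (a deliberately crude text featurization)."""
    lowered = note.lower()
    return {f"note_has_{kw}": int(kw in lowered) for kw in keywords}


def early_fusion_row(
    tabular: Dict[str, Number],
    series: Dict[str, List[Number]],
    note: str,
    keywords: List[str],
) -> Dict[str, Number]:
    """Aggregate all modalities into a single unified tabular feature row."""
    row: Dict[str, Number] = dict(tabular)
    for name, values in series.items():
        row.update(summarize_series(name, values))
    row.update(text_flags(note, keywords))
    return row


if __name__ == "__main__":
    row = early_fusion_row(
        tabular={"age": 70},
        series={"hr": [60, 80, 100]},
        note="History of sepsis.",
        keywords=["sepsis", "delirium"],
    )
    print(row)
```

A row built this way can be passed to any standard tabular classifier, which is why early fusion is often used as a first baseline. The trade-off is that temporal structure and textual context are lost in the aggregation step.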

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12876225/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12876225/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/PMC12876225/full.md

---
Source: https://tomesphere.com/paper/PMC12876225