# Advancing shock prediction: leveraging prior knowledge and self-controlled data for enhanced model accuracy and generalizability

**Authors:** Cheng-Yu Tsai, Xiu-Rong Huang, Po-Tsun Kuo, Tzu-Tao Chen, Yun-Kai Yeh, Kuan-Yuan Chen, Arnab Majumdar, Chien-Hua Tseng

PMC · DOI: 10.1186/s12911-025-03108-2 · 2025-07-14

## TL;DR

This study improves shock prediction in ICU patients by using physiological waveforms and medical knowledge, enabling early warning without blood tests.

## Contribution

A novel machine learning model that uses self-controlled data and features engineered from physiological waveforms to predict shock one hour in advance.

## Key findings

- A weighted ensemble model achieved an AUC of 0.93 and 84.15% accuracy in predicting shock.
- Key predictive features included ECG heart rate variability and respiratory waveform characteristics.
- The model successfully predicted shock using only four physiological waveforms and no blood tests.

## Abstract

Timely intervention in shock is vital, as delays over one hour greatly increase mortality. This study aims to develop an enhanced machine learning model that improves predictive performance by utilizing self-controlled data and applying feature engineering informed by medical knowledge to physiological waveforms, enabling the prediction of shock one hour in advance without relying on blood tests.

Patient data and physiological waveforms were obtained from the Medical Information Mart for Intensive Care III (MIMIC-III) database. Shock was defined as a mean arterial pressure ≤ 65 mmHg for more than one minute, combined with a serum lactate level ≥ 2 mmol/L within 12 h before or after the hypotension event. Waveforms used for prediction were extracted from a 30-minute segment ending one hour before the event. Self-controlled waveforms were obtained from the same patient either one day before or up to seven days after the shock event.
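The labeling rule and the timing of the prediction segment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `(timestamp, value)` sample layout, the choice of hypotension onset as the event time, and the reading of "more than one minute" as a continuous run of samples are all hypothetical.

```python
from datetime import datetime, timedelta

# Thresholds taken from the definition in the abstract.
MAP_THRESHOLD_MMHG = 65.0
LACTATE_THRESHOLD_MMOL_L = 2.0
LACTATE_WINDOW = timedelta(hours=12)
MIN_HYPOTENSION = timedelta(minutes=1)
LEAD_TIME = timedelta(hours=1)          # gap between segment end and event
SEGMENT_LENGTH = timedelta(minutes=30)  # length of the prediction segment


def first_hypotension_onset(map_samples):
    """Return the start of the first run with MAP <= 65 mmHg lasting > 1 min.

    map_samples: list of (timestamp, map_mmhg) tuples sorted by time.
    Assumption: the run must be continuous; any sample above the
    threshold resets it.
    """
    run_start = None
    for ts, value in map_samples:
        if value <= MAP_THRESHOLD_MMHG:
            if run_start is None:
                run_start = ts
            if ts - run_start > MIN_HYPOTENSION:
                return run_start
        else:
            run_start = None
    return None


def is_shock(onset, lactate_samples):
    """True if any lactate >= 2 mmol/L lies within 12 h before or after onset."""
    return any(
        abs(ts - onset) <= LACTATE_WINDOW and value >= LACTATE_THRESHOLD_MMOL_L
        for ts, value in lactate_samples
    )


def prediction_window(onset):
    """Return (start, end) of the 30-minute segment ending 1 h before onset."""
    end = onset - LEAD_TIME
    return end - SEGMENT_LENGTH, end


# Example with synthetic values (not patient data).
t0 = datetime(2020, 1, 1, 12, 0, 0)
map_samples = [
    (t0 + timedelta(seconds=30 * i), v)
    for i, v in enumerate([80, 64, 63, 62, 60, 70])
]
onset = first_hypotension_onset(map_samples)
if onset is not None and is_shock(onset, [(t0 - timedelta(hours=11), 2.4)]):
    start, end = prediction_window(onset)
    print(f"Extract waveforms from {start} to {end}")
```

Under these assumptions, a self-controlled segment would be cut the same way from a reference time one day before or up to seven days after the event in the same patient.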

The study included 389 ICU patients who met the shock criteria and had complete physiological waveform data available for analysis. A total of 299 features were derived: 90 from arterial blood pressure (ABP), 89 from electrocardiogram (ECG), 112 from respiratory waveforms (RESP), and 8 from blood oxygen saturation (SpO2). The weighted ensemble model performed best, with an AUC of 0.93, an accuracy of 84.15%, and a sensitivity of 79.64% in the testing set. The most predictive features included ECG_HRV_pNN50 (proportion of successive heartbeat intervals differing by more than 50 ms), RESP_Width_Mean (mean width of respiratory waveform), RESP_Cycle_Rate_Mean (mean respiratory cycle rate), ABP_TimeSBP2DBP_SampEn (sample entropy of systolic-diastolic intervals), and ABP_AmplitudeDBP_Median (median amplitude of diastolic peaks).
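As an illustration of one of the listed features, pNN50 can be computed from a series of RR intervals as follows. This is a minimal sketch of the standard heart rate variability definition; the function name is hypothetical, and reporting the result as a percentage rather than a fraction is an assumption, since the abstract does not specify how ECG_HRV_pNN50 was scaled or which toolkit produced it.

```python
def pnn50(rr_intervals_ms):
    """Percentage of successive RR-interval differences greater than 50 ms.

    rr_intervals_ms: sequence of RR intervals in milliseconds, in beat order.
    """
    # Absolute difference between each interval and the one that follows it.
    diffs = [abs(b - a) for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return 100.0 * sum(d > 50 for d in diffs) / len(diffs)


# Synthetic example: successive differences are 60, 10, 60, 10 ms,
# so two of the four differences exceed 50 ms.
print(pnn50([800, 860, 850, 790, 800]))  # 50.0
```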

This study demonstrated the feasibility of predicting shock one hour before its onset using only four physiological waveforms, combined with feature engineering based on physiological concepts and self-controlled data. The model achieved an AUC of 0.93 and a sensitivity of 79.64% in the testing set.
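The abstract does not state which base models the weighted ensemble combines or how its weights were chosen. One common reading is a weighted average of per-model predicted probabilities, sketched below together with the sensitivity and accuracy calculations. The model names, weights, threshold of 0.5, and all numbers are hypothetical and serve only to show the arithmetic.

```python
def weighted_ensemble(prob_by_model, weights):
    """Weighted average of per-model shock probabilities for each sample.

    prob_by_model: dict mapping model name to a list of probabilities.
    weights: dict mapping the same model names to non-negative weights.
    """
    total = sum(weights[name] for name in prob_by_model)
    n_samples = len(next(iter(prob_by_model.values())))
    return [
        sum(weights[name] * probs[i] for name, probs in prob_by_model.items()) / total
        for i in range(n_samples)
    ]


def sensitivity_and_accuracy(y_true, y_prob, threshold=0.5):
    """Return (sensitivity, accuracy) after thresholding the probabilities."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), correct / len(y_true)


# Synthetic example with two hypothetical base models.
probs = {
    "model_a": [0.9, 0.2, 0.7, 0.3],
    "model_b": [0.7, 0.4, 0.3, 0.6],
}
weights = {"model_a": 3.0, "model_b": 1.0}
labels = [1, 0, 1, 1]  # 1 = shock segment, 0 = self-controlled segment

ensemble_probs = weighted_ensemble(probs, weights)
sensitivity, accuracy = sensitivity_and_accuracy(labels, ensemble_probs)
print(ensemble_probs, sensitivity, accuracy)
```

In this toy example the ensemble misses one of the three positive samples, which gives a sensitivity of 2/3 and an accuracy of 3/4.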

The online version contains supplementary material available at 10.1186/s12911-025-03108-2.

## Full-text entities

- **Diseases:** Shock (MESH:D012769), hypotension (MESH:D007022)
- **Chemicals:** lactate (MESH:D019344), oxygen (MESH:D010100)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12261771/full.md

---
Source: https://tomesphere.com/paper/PMC12261771