# Machine Learning Models for Mortality Prediction in Intensive Care Unit Patients With Ischemic Stroke Associated With Intracranial Artery Stenosis: Retrospective Cohort Study

**Authors:** Kun Zhang, Ruomeng Chen, Jingyi Yang, Yan Yan, Lijuan Liu, Chaoyue Meng, Peifang Li, Guoying Xing, Xiaoyun Liu

PMC · DOI: 10.2196/82042 · JMIR Cardio · 2026-02-24

## TL;DR

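This study uses machine learning to predict all-cause mortality in ICU patients with ischemic stroke and intracranial artery stenosis or occlusion. The best models reached an area under the curve of about 0.82-0.83 on a held-out test set, and acute physiology score III, suspected infection, and age were among the key risk factors.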

## Contribution

Develops and evaluates interpretable machine learning models for all-cause mortality prediction in ICU patients with ischemic stroke associated with intracranial artery stenosis or occlusion, using the MIMIC-IV database.

## Key findings

- LightGBM, Bagging, and logistic regression achieved an area under the curve of 0.82-0.83 and accuracy above 73%.
- Shapley additive explanations analysis identified acute physiology score III, suspected infection, Charlson comorbidity index, and age among the most influential predictors of mortality.
- Higher physiological severity and comorbidity burden were consistently linked to increased mortality risk.

## Abstract

Mortality prediction in intensive care unit (ICU) patients with ischemic stroke complicated by intracranial artery stenosis or occlusion remains difficult. Conventional scoring systems often lack discriminatory power and fail to provide individualized risk estimates. Machine learning approaches have been increasingly explored to integrate diverse clinical features for prognostic modeling.

This study aims to develop and evaluate machine learning models for individualized mortality prediction in ICU patients with ischemic stroke associated with intracranial artery stenosis or occlusion.

Using the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, we conducted a retrospective cohort study including 5280 adult ICU patients identified through International Classification of Diseases, Ninth and Tenth Revision (ICD-9/10) codes. The primary outcome was all-cause mortality, defined by the presence of a recorded date of death (dod) in MIMIC-IV: patients with a documented dod were classified as deceased, and those without one as nondeceased. Patients were randomly split into training (n=3696, 70%) and testing (n=1584, 30%) cohorts. Missing value imputation, correlation reduction, and multistep supervised feature selection (gradient boosting, BorutaShap, recursive feature elimination with cross-validation, LassoCV, and chi-square analysis) were performed exclusively within the training set and subsequently applied to the test set, resulting in 35 retained predictive features. Eight machine learning models (light gradient boosting machine [LightGBM], Bagging [bootstrap aggregating], random forest, logistic regression, support vector machine, gradient boosting, adaptive boosting, and k-nearest neighbors) were trained with hyperparameter optimization using RandomizedSearchCV. Model performance was evaluated using area under the curve, accuracy, recall, precision, F1-score, and calibration curves. Shapley additive explanations were used for global and individual-level interpretability.
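The outcome definition, 70/30 split, and train-only preprocessing described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' code: the cohort is synthetic, the field names `dod` and `rdw` and all helper functions are hypothetical, and median imputation stands in for whatever imputation method the study actually used.

```python
import random
import statistics

def label_mortality(patient):
    """Return 1 if a date of death (dod) is recorded, else 0."""
    return 1 if patient.get("dod") else 0

def split_train_test(patients, test_fraction=0.30, seed=0):
    """Randomly split patients into (train, test) lists."""
    shuffled = list(patients)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_median(train, feature):
    """Learn the imputation value from the training set only."""
    observed = [p[feature] for p in train if p.get(feature) is not None]
    return statistics.median(observed)

def impute(patients, feature, value):
    """Fill missing values with a statistic that was learned elsewhere."""
    return [
        {**p, feature: value if p.get(feature) is None else p[feature]}
        for p in patients
    ]

# Synthetic cohort with the same size as the study; all values are made up.
rng = random.Random(42)
cohort = [
    {
        "id": i,
        "dod": "2150-01-01" if rng.random() < 0.4 else None,
        "rdw": None if rng.random() < 0.1 else round(rng.uniform(11.0, 20.0), 1),
    }
    for i in range(5280)
]

train, test = split_train_test(cohort)
rdw_median = fit_median(train, "rdw")            # fitted on training data only
train_imputed = impute(train, "rdw", rdw_median)
test_imputed = impute(test, "rdw", rdw_median)   # reuses the training statistic
y_train = [label_mortality(p) for p in train_imputed]
y_test = [label_mortality(p) for p in test_imputed]
print(len(train), len(test))  # 3696 1584
```

The test set never contributes to `rdw_median`, so the held-out performance estimate is not contaminated by information from the test patients. The same fit-on-train, apply-to-test pattern would extend to correlation filtering and feature selection.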

LightGBM, Bagging, and logistic regression demonstrated comparable discrimination, achieving an area under the curve of approximately 0.82-0.83 and accuracy above 73% on the independent test set. LightGBM demonstrated balanced performance (recall 0.70; precision 0.72) and good calibration. Shapley additive explanations analysis identified acute physiology score III, suspected infection, Charlson comorbidity index, age, weight on admission, and red cell distribution width as the most influential predictors. Overall, higher physiological severity, greater comorbidity burden, and older age were consistently associated with increased observed mortality risk.
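To make the reported metrics concrete, the sketch below shows how an F1-score follows from precision and recall, and how the area under the curve can be read as the probability that a randomly chosen deceased patient receives a higher predicted risk than a randomly chosen survivor. The F1 value is derived here from the reported LightGBM recall and precision, not quoted from the paper, and the toy labels and scores are invented for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# F1 implied by the reported LightGBM precision (0.72) and recall (0.70).
lightgbm_f1 = f1_score(precision=0.72, recall=0.70)
print(round(lightgbm_f1, 2))  # 0.71

# Toy example: 1 = deceased, 0 = nondeceased; scores are predicted risks.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(auc(toy_labels, toy_scores))  # 8 of 9 pairs ranked correctly
```

On this reading, an area under the curve of 0.82-0.83 means that the model ranks a deceased patient above a surviving one in roughly 82% to 83% of such pairs.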

Machine learning models—including LightGBM and Bagging—provide interpretable predictions of all-cause mortality in ICU patients with ischemic stroke and intracranial arterial disease. These models highlight key prognostic features and may support mortality risk stratification. External validation and evaluation of workflow integration are warranted before clinical implementation.
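The individual-level interpretability mentioned above rests on Shapley values, which split a prediction's deviation from a baseline across the input features. The following is a minimal sketch of the exact calculation on a toy model. The logistic coefficients, the baseline patient, and the feature names are all invented for illustration, and practical SHAP tooling uses faster approximations over a background dataset instead of this brute-force enumeration.

```python
from itertools import permutations
from math import exp

# Hypothetical reference patient used as the "feature absent" baseline.
BASELINE = {"aps_iii": 40.0, "suspected_infection": 0.0, "age": 65.0}

def toy_risk(x):
    """Hypothetical logistic risk model; coefficients are invented."""
    z = -6.0 + 0.05 * x["aps_iii"] + 0.8 * x["suspected_infection"] + 0.03 * x["age"]
    return 1.0 / (1.0 + exp(-z))

def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every order in which features can be switched from baseline to
    the patient's actual value."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = instance[f]
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: total / len(orderings) for f, total in totals.items()}

patient = {"aps_iii": 80.0, "suspected_infection": 1.0, "age": 82.0}
phi = shapley_values(toy_risk, patient, BASELINE)
for feature, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's predicted risk and the baseline risk, which is what allows a single prediction to be decomposed feature by feature.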

## Linked entities

- **Diseases:** ischemic stroke (MONDO:1060198)

## Full-text entities

- **Diseases:** Coma (MESH:D003128), neurological deterioration (MESH:D009422), infectious complications (MESH:D003141), sepsis (MESH:D018805), dementia (MESH:D003704), cardiovascular disease (MESH:D002318), Intracranial Artery Stenosis (MESH:D012078), Infection (MESH:D007239), Ischemic Stroke (MESH:D002544), Cerebrovascular diseases (MESH:D002561), death (MESH:D003643), hypertension (MESH:D006973), intracranial artery stenosis or occlusion (MESH:D001157), neurological deficits (MESH:D009461), Organ Dysfunction (MESH:D009102), pneumonia (MESH:D011014), Stroke (MESH:D020521), malignancy (MESH:D009369), diabetes (MESH:D003920), intracranial arterial disease (MESH:D020765), respiratory or systemic infections (MESH:D012141), critically ill (MESH:D016638), dyslipidemia (MESH:D050171)
- **Chemicals:** creatinine (MESH:D003404), glucose (MESH:D005947), calcium (MESH:D002118), triglycerides (MESH:D014280)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12931835/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12931835/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12931835/full.md

---
Source: https://tomesphere.com/paper/PMC12931835