# Development and validation of an interpretable machine learning model for predicting cognitive impairment in patients with sepsis

**Authors:** Guisheng Liang, Mo Lu, Yurong Pan

PMC · DOI: 10.3389/fmed.2025.1753260 · Frontiers in Medicine · 2026-01-21

## TL;DR

This study developed and validated an interpretable random forest model that predicts cognitive impairment after sepsis from seven clinical variables, with SOFA score and age as the most influential predictors.

## Contribution

An interpretable machine learning model for predicting post-sepsis cognitive impairment, with SHAP values showing how much each clinical predictor contributes to the predicted risk.

## Key findings

- The random forest model performed best of the five models compared, with an AUC of 0.947 in the training set and 0.895 in the validation set.
- SOFA score was the most influential predictor of cognitive impairment, followed by age, APACHE II score, IL-10 level, and years of education.
- SHAP analysis made the model's predictions interpretable by quantifying each predictor's contribution.

## Abstract

Cognitive impairment is a common and debilitating complication after sepsis. This study aimed to develop and validate an interpretable machine learning (ML) model to predict post-sepsis cognitive impairment and identify key clinical risk factors.

A retrospective cohort of 866 adult sepsis patients treated in our hospital between January 2020 and January 2025 was analyzed. Cognitive function was assessed 1–3 months after discharge using the Montreal Cognitive Assessment (MoCA), with scores < 26 indicating impairment. Key predictors were selected via least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm. Five ML models were developed: logistic regression, extreme gradient boosting (XGBoost), random forest (RF), k-nearest neighbors (KNN), and decision tree (DT). The models were evaluated using area under the curve (AUC), accuracy, and F1-score. SHapley Additive exPlanations (SHAP) values were applied for interpretability.
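The outcome definition and evaluation metrics named above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline. The only value taken from the abstract is the MoCA cutoff of 26. The function names, the 0.5 decision threshold, and the six toy patients are assumptions made up for the example.

```python
from typing import List, Sequence, Tuple

MOCA_CUTOFF = 26  # from the abstract: MoCA scores < 26 indicate impairment


def label_impairment(moca_scores: Sequence[int]) -> List[int]:
    """Return 1 (impaired) when MoCA < 26, otherwise 0."""
    return [1 if score < MOCA_CUTOFF else 0 for score in moca_scores]


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding scores (threshold is an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data, invented for illustration only (not from the study).
moca = [28, 22, 27, 19, 26, 24]
risk = [0.10, 0.80, 0.40, 0.90, 0.30, 0.35]  # hypothetical model outputs

labels = label_impairment(moca)
auc = roc_auc(labels, risk)
acc, f1 = accuracy_and_f1(labels, risk)
print(labels, round(auc, 3), round(acc, 3), round(f1, 3))
```

Note that a MoCA score of exactly 26 is labeled unimpaired, because the cutoff is strict. AUC is independent of the decision threshold, while accuracy and F1 change with it.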

Cognitive impairment occurred in 195 patients (22.5%). Seven variables were identified as key predictors of cognitive impairment: age, years of education, septic shock, benzodiazepine use, acute physiology and chronic health evaluation II (APACHE II) score, sequential organ failure assessment (SOFA) score, and interleukin-10 (IL-10) level. The RF model performed best, with AUCs of 0.947 (training set) and 0.895 (validation set), showing good calibration and clinical utility. SHAP analysis showed that SOFA score had the greatest influence on cognitive impairment, followed by age, APACHE II score, IL-10, and years of education.
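To make the SHAP ranking concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by enumerating every feature coalition. It illustrates the quantity that SHAP estimates. It is not the authors' RF model or their SHAP implementation. The linear weights, the patient, and the reference values are invented so the result can be checked by hand, which means the resulting order is a consequence of those made-up numbers, not evidence from the study.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition: set) -> float:
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk function: weights are made up, not fitted to any data.
names = ["SOFA", "age", "APACHE II"]


def toy_risk(features: Sequence[float]) -> float:
    sofa, age, apache = features
    return 0.05 * sofa + 0.015 * age + 0.02 * apache


patient = [10, 70, 20]   # invented example patient
reference = [4, 60, 15]  # invented reference (baseline) patient

phi = shapley_values(toy_risk, patient, reference)
ranked = sorted(zip(names, phi), key=lambda pair: abs(pair[1]), reverse=True)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the additivity property that gives SHAP its name. For a tree ensemble such as a random forest, dedicated algorithms avoid the exponential enumeration used here.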

SHAP analysis of the RF model clarified which factors contributed most to its predictions of cognitive impairment after sepsis. The model not only achieved high predictive accuracy but also offered a transparent, data-driven tool to identify patients at elevated risk, potentially enabling timely interventions and tailored clinical management.

## Full-text entities

- **Genes:** IL10 (interleukin 10) [NCBI Gene 3586] {aka CSIF, GVHDS, IL-10, IL10A, TGIF}
- **Diseases:** organ failure (MESH:D009102), sepsis (MESH:D018805), septic shock (MESH:D012772), Cognitive impairment (MESH:D003072)
- **Chemicals:** benzodiazepine (MESH:D001569)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12868286/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12868286/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/PMC12868286/full.md

---
Source: https://tomesphere.com/paper/PMC12868286