# Identification and validation of an interpretable EEG-based machine learning model for the diagnosis of post-stroke cognitive impairment

**Authors:** Xinyang Wang, Jian Song, Weicheng Kong, Wei Wei, Haoran Shi, Peitao Xu, Yuqing Zhao, Jiayu Cai, Xiehua Xue

PMC · DOI: 10.3389/fnagi.2025.1700771 · Frontiers in Aging Neuroscience · 2026-01-12

## TL;DR

This study develops an interpretable EEG-based machine learning model to accurately detect cognitive impairment after stroke, with a web tool for individualized risk prediction.

## Contribution

The study develops and validates an interpretable machine learning model that uses EEG data for early detection of post-stroke cognitive impairment.

## Key findings

- Seven EEG features were identified as most predictive of post-stroke cognitive impairment.
- The random forest model performed best (AUC = 0.91, accuracy = 0.83) and held up in an independent external cohort (n = 42; AUC = 0.97, accuracy = 0.90).
- An accessible web tool was created for individualized risk prediction of PSCI.

## Abstract

Post-stroke cognitive impairment (PSCI) is a prevalent and disabling consequence of stroke, yet objective tools for its early identification are lacking. This study aimed to develop and validate an interpretable machine learning (ML) model based on electroencephalography (EEG) to support the early detection of PSCI.

We conducted a study involving 174 participants, including stroke patients with and without cognitive impairment and age-matched healthy controls. Resting-state EEG was acquired from all subjects, and multidimensional features, including power spectral ratios and microstate parameters, were extracted. Feature selection was performed using LASSO regression, random forest, and the Boruta algorithm. Five machine learning models were compared on area under the receiver operating characteristic curve (AUC), accuracy, Brier score, calibration plots, and decision curve analysis. Model predictions were interpreted using SHAP (SHapley Additive exPlanations). The final validated model was deployed as an interactive web-based application.
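The comparison metrics named above have standard definitions, and a minimal sketch may make them concrete. The function names, the 0.5 decision threshold, and the toy labels and probabilities below are illustrative assumptions, not the authors' code or data; the net-benefit formula is the usual one from decision curve analysis.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outranks a random negative one."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at the given probability threshold."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(y == q) for y, q in zip(y_true, preds)) / len(y_true)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy example (hypothetical values, not study data).
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print("AUC:", auc_score(y_true, y_prob))
print("Accuracy:", accuracy(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Net benefit at 0.5:", net_benefit(y_true, y_prob, 0.5))
```

AUC and accuracy measure discrimination, the Brier score also penalizes poorly calibrated probabilities, and net benefit weighs true positives against false positives at a chosen risk threshold, which is why the abstract reports them together.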

Seven EEG features were identified as most predictive of PSCI: the delta-plus-theta to alpha-plus-beta ratio (DTABR) in frontal, central, and global regions; the mean microstate duration of classes A and B (A-MMD, B-MMD); the mean frequency of occurrence of microstate D (D-MFO); and the mean coverage of microstate A (A-MC). The random forest model demonstrated the highest performance (AUC = 0.91, accuracy = 0.83, specificity = 0.88, Brier score = 0.12), alongside satisfactory calibration and a positive net clinical benefit. The model was further validated on an independent external cohort (n = 42), showing robust predictive performance (AUC = 0.97, accuracy = 0.90). An accessible web tool was created for individualized risk prediction (https://eeg-predict.streamlit.app/).
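As a rough illustration of how features of this kind are commonly computed, the sketch below derives a DTABR value from band powers and per-class microstate duration, occurrence rate, and coverage from a per-sample label sequence. The band-power dictionary, the function names, the 100 Hz sampling rate, and the toy label sequence are all assumptions made for illustration; this summary does not state the paper's frequency band limits, regional averaging, or microstate segmentation pipeline.

```python
from itertools import groupby
from typing import Dict, Sequence


def dtabr(band_power: Dict[str, float]) -> float:
    """(delta + theta) / (alpha + beta), computed from absolute band powers."""
    slow = band_power["delta"] + band_power["theta"]
    fast = band_power["alpha"] + band_power["beta"]
    if fast <= 0:
        raise ValueError("alpha + beta power must be positive")
    return slow / fast


def microstate_stats(labels: Sequence[str], sfreq: float) -> Dict[str, Dict[str, float]]:
    """Per-class mean duration (ms), occurrences per second, and coverage (fraction of time)."""
    if not labels:
        raise ValueError("labels must not be empty")
    total_samples = len(labels)
    total_seconds = total_samples / sfreq
    # Collapse the per-sample label sequence into contiguous (label, length) segments.
    segments = [(label, sum(1 for _ in run)) for label, run in groupby(labels)]
    stats: Dict[str, Dict[str, float]] = {}
    for cls in sorted(set(labels)):
        lengths = [n for label, n in segments if label == cls]
        samples = sum(lengths)
        stats[cls] = {
            "mean_duration_ms": 1000.0 * samples / len(lengths) / sfreq,
            "occurrence_per_s": len(lengths) / total_seconds,
            "coverage": samples / total_samples,
        }
    return stats


# Toy inputs (hypothetical values, not study data).
band_power = {"delta": 4.0, "theta": 2.0, "alpha": 3.0, "beta": 1.0}
labels = list("AAABBAADDDCC")  # one microstate label per EEG sample
stats = microstate_stats(labels, sfreq=100.0)

print("DTABR:", dtabr(band_power))
print("Microstate A:", stats["A"])
print("Microstate D:", stats["D"])
```

A higher DTABR indicates relatively more slow-wave (delta and theta) power, and the three microstate statistics summarize how long, how often, and for what share of the recording each class is active.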

The findings suggest that an interpretable EEG-based ML model can provide accurate early screening of PSCI. Integration of this approach into clinical workflows may support personalized rehabilitation strategies and optimize post-stroke care. Future studies are warranted to validate the model in larger, multicenter cohorts.

## Linked entities

- **Diseases:** stroke (MONDO:0005098)

## Full-text entities

- **Diseases:** stroke (MESH:D020521), PSCI (MESH:D003072)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12832639/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12832639/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12832639/full.md

---
Source: https://tomesphere.com/paper/PMC12832639