# Prediction and Classification the Risk of Stroke Patients Using Machine Learning Techniques: A Retrospective Cross‐Sectional Study

**Authors:** Ghasem Alizadeh‐dizaj, Shiva Khoshsirat

PMC · DOI: 10.1002/hsr2.71419 · Health Science Reports · 2025-10-28

## TL;DR

This study uses machine learning to predict stroke risk and severity based on patient data and identifies key risk factors.

## Contribution

The study evaluates multiple machine learning algorithms for stroke prediction and identifies kNN as the most effective.

## Key findings

- kNN achieved the highest accuracy (97.30%), sensitivity (98.75%), specificity (98.72%), and F1 score (98.66%) in predicting stroke severity.
- Physical inactivity, high cholesterol, and cardiovascular disease were top risk factors for stroke severity.
- Machine learning proved effective in identifying risk factors and could enhance stroke management strategies.

## Abstract

Stroke is one of the most common causes of death and neurological disability in all societies. Predictive models built with machine learning techniques can help identify people at risk and thereby reduce the complications of the disease. This study evaluated the performance of several machine learning algorithms and used a decision tree to predict stroke risk in patients with suspected stroke, based on the risk factors that affect it.

The study applied machine learning algorithms to the medical records of 1184 patients with suspected stroke who presented at an emergency department. Attributes considered included age, primary diagnosis, gender, blood pressure, smoking, diabetes, and other relevant factors. The data set was preprocessed to handle missing and incompatible data. Naïve Bayes, neural network, kNN, SVM, and classification tree algorithms were applied, with a 70–30 training‐test data split using K‐fold cross‐validation.
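As a rough illustration of the hold-out evaluation described above, the following sketch splits a labelled data set 70–30 and classifies the held-out rows with a minimal k-nearest-neighbours rule. It is not the authors' pipeline: the synthetic two-feature data, the function names `train_test_split` and `knn_predict`, the choice `k=3`, and the use of Euclidean distance are all assumptions made for the example, and K-fold cross-validation is not shown.

```python
import math
import random
from collections import Counter


def train_test_split(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle indices and hold out `test_fraction` of the rows for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, y_train, X_test, y_test


def knn_predict(X_train, y_train, query, k=3):
    """Return the majority label among the k training rows closest to `query`."""
    neighbours = sorted(
        zip(X_train, y_train),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance (assumed)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Synthetic, well-separated toy data (an assumption; not the study's records).
rng = random.Random(42)
rows = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(20)]
rows += [(rng.uniform(4, 5), rng.uniform(4, 5)) for _ in range(20)]
labels = [0] * 20 + [1] * 20

X_train, y_train, X_test, y_test = train_test_split(rows, labels)
predictions = [knn_predict(X_train, y_train, q) for q in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out rows: {len(X_test)}, accuracy: {accuracy:.2f}")
```

On real clinical attributes, features would normally be scaled before distances are computed, since kNN is sensitive to differing units.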

Among the machine learning algorithms used, kNN demonstrated the highest accuracy (97.30%), sensitivity (98.75%), specificity (98.72%), and F1 score (98.66%) in predicting stroke severity. Physical inactivity, high cholesterol, cardiovascular disease, history of transient ischemic attack, and high blood pressure emerged as the most influential risk factors for stroke severity. Decision tree analysis provided valuable insights into the relationship between risk factors and stroke severity.
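For readers less familiar with the four reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and F1 are conventionally derived from confusion-matrix counts. It assumes a binary outcome for simplicity (the summary does not say how many severity classes the study used), and the names `BinaryMetrics` and `binary_metrics` as well as the toy labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Derive the four metrics from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on positives
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(accuracy, sensitivity, specificity, f1)


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With more than two severity classes, these quantities would be computed per class and then averaged, and the averaging scheme affects the reported values.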

Machine learning techniques proved effective in identifying risk factors and predicting stroke severity, showing promise for enhancing stroke management strategies. The study highlighted the importance of physical inactivity and other key risk factors in stroke prediction. The consistency of risk factor importance across studies suggests common underlying factors, although variations arise from geographic and lifestyle differences.

## Linked entities

- **Diseases:** stroke (MONDO:0005098), cardiovascular disease (MONDO:0004995), diabetes (MONDO:0005015)

## Full-text entities

- **Diseases:** diabetes (MESH:D003920), neurological disabilities (MESH:D009069), cardiovascular disease (MESH:D002318), Stroke (MESH:D020521), ischemic attack (MESH:D002546), death (MESH:D003643)
- **Chemicals:** cholesterol (MESH:D002784)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12559919/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12559919/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12559919/full.md

---
Source: https://tomesphere.com/paper/PMC12559919