# Comparison of logistic regression and machine learning methods for predicting early neurological deterioration after thrombolysis in patients with mild stroke

**Authors:** Chen Lou, Meiyun Zhang, Jingjing Li, Dongjuan Xu

PMC · DOI: 10.3389/fneur.2026.1703890 · Frontiers in Neurology · 2026-03-04

## TL;DR

This study compares logistic regression with four machine learning methods for predicting early neurological deterioration after thrombolysis in patients with mild stroke, and finds that the two approaches perform comparably.

## Contribution

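The study builds and compares 16 machine learning workflows (four algorithms, each paired with four preprocessing schemes for imbalanced data) and five logistic regression models for predicting early neurological deterioration after thrombolysis in patients with mild stroke.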

## Key findings

- The optimal machine learning workflow, an SVM with upsampling, achieved an AUC of 0.889 (95% CI: 0.853, 0.926) in the training set and 0.859 in the test set.
- Logistic regression and machine learning models showed comparable performance in predicting early neurological deterioration; the best logistic regression model (m4) reached an AUC of 0.848 in the test set.
- Eighty of 625 patients with mild stroke (12.8%) experienced early neurological deterioration after thrombolysis.
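
As a rough aid to reading these numbers, the sketch below shows one common way to interpret an AUC: the fraction of (deteriorated, non-deteriorated) patient pairs in which the deteriorated patient receives the higher predicted score. The function name `auc` and the toy scores are illustrative assumptions, not data or code from the paper. Only the 80 and 625 counts come from the source.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up scores for five hypothetical patients (1 = deterioration, 0 = none).
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1]
toy_labels = [1, 0, 1, 1, 0]
toy_auc = auc(toy_scores, toy_labels)

# Event rate reported in the study: 80 of 625 patients.
event_rate = 80 / 625
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used. The pairwise form is shown only because it makes the definition explicit.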

## Abstract

We aimed to explore the risk factors for early neurological deterioration after thrombolysis in patients with mild stroke. We established machine learning and logistic regression models and compared them, to facilitate early identification of patients with mild stroke who experience early neurological deterioration despite thrombolysis. Early identification can alert physicians so that remedial clinical measures can be prepared in advance.

We studied patients with mild stroke who underwent thrombolysis in the emergency department from April 1, 2017 to April 1, 2024. Four common machine learning methods, Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM), were used to build predictive models from the data of eligible participants. The imbalanced data were preprocessed using four different methods, and each machine learning method was paired with each of the four preprocessing schemes, resulting in 16 workflows, from which we selected the optimal machine learning model. In addition, five methods were used to build logistic regression models, from which the optimal logistic regression model was selected.
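
The 16 workflows are simply the cross product of four algorithms and four preprocessing schemes. A minimal sketch of that enumeration follows. The abstract names only upsampling among the preprocessing methods, so the other three labels are placeholders rather than the authors' actual schemes.

```python
from itertools import product

# The four algorithms named in the abstract.
MODELS = ["XGBoost", "KNN", "RF", "SVM"]

# Only "upsampling" is named in the abstract; the remaining labels are
# placeholders standing in for the three unnamed preprocessing schemes.
PREPROCESSING = ["upsampling", "scheme_2", "scheme_3", "scheme_4"]

# Every (model, preprocessing) pair is one candidate workflow.
workflows = list(product(MODELS, PREPROCESSING))

for model, scheme in workflows:
    print(f"{model} + {scheme}")
```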

A total of 625 patients with mild stroke were included in the study, of whom 80 experienced early neurological deterioration after thrombolysis. Using 10-fold stratified cross-validation and a simulated annealing algorithm, the SVM model that balanced the data through upsampling was selected as the optimal model among the 16 machine learning workflows. The area under the curve (AUC) of the SVM model was 0.889 (95% CI: 0.853, 0.926) in the training set and 0.859 in the test set processed by upsampling. Among the five methods used to establish logistic regression models, model m4 was the optimal one, with an AUC of 0.848 in the test set.
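
To make the resampling terms concrete, here is a minimal standard-library sketch of stratified 10-fold assignment followed by random upsampling of the minority class, using the reported class counts (80 events, 545 non-events). It is an illustration under stated assumptions, not the authors' pipeline: the helper names `stratified_folds` and `upsample` are hypothetical, and upsampling is applied here only to the training portion, which may differ from how the study handled its test set.

```python
import random
from collections import Counter

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin within each class
    return folds

def upsample(indices, labels, seed=0):
    """Resample minority-class indices with replacement until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)                                  # keep every original
        balanced.extend(rng.choices(idx, k=target - len(idx)))  # add duplicates
    return balanced

# Class counts from the abstract: 80 with deterioration, 545 without.
labels = [1] * 80 + [0] * 545
folds = stratified_folds(labels, k=10)

held_out = folds[0]                            # one fold held out for validation
train = [i for f in folds[1:] for i in f]      # the other nine folds
train_balanced = upsample(train, labels)
counts = Counter(labels[i] for i in train_balanced)
```

Resampling inside each training split, rather than before splitting, keeps duplicated minority cases out of the held-out fold. Libraries such as scikit-learn (`StratifiedKFold`) and imbalanced-learn (`RandomOverSampler`) provide tested equivalents.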

We explored the risk factors influencing the occurrence of early neurological deterioration after thrombolysis in patients with mild stroke. We also found that the logistic regression and machine learning models demonstrated comparable performance in this single-center retrospective dataset.

## Linked entities

- **Diseases:** stroke (MONDO:0005098)

## Full-text entities

- **Diseases:** stroke (MESH:D020521), neurologic deterioration (MESH:D009422)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12996063/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12996063/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/PMC12996063/full.md

---
Source: https://tomesphere.com/paper/PMC12996063