Analysis and classification of main risk factors causing stroke in Shanxi Province
Junjie Liu, Yiyang Sun, Jing Ma, Jiachen Tu, Yuhui Deng, Ping He, Huaxiong Huang, Xiaoshuang Zhou, Shixin Xu

TL;DR
This study analyzes risk factors for stroke in Shanxi Province using machine learning models on hospital and census data, identifying hypertension, inactivity, and overweight as top risks and enabling stroke probability prediction.
Contribution
It introduces an interpretable machine learning approach to rank and evaluate stroke risk factors specific to Shanxi Province, incorporating data cleaning and feature importance analysis.
Findings
Hypertension, inactivity, and overweight are top risk factors.
Machine learning models can predict individual stroke risk.
Feature importance analysis aligns with medical knowledge.
Abstract
In China, stroke has been the leading cause of death in recent years. It is a major cause of long-term physical and cognitive impairment, which places great pressure on the national public health system. Evaluating the risk of stroke is important for its prevention and treatment in China. This study analyzes a data set of 2000 stroke patients hospitalized in 2018 and 27583 residents from 2017 to 2020. Due to data incompleteness, inconsistency, and non-structured formats, missing values in the raw data are filled with -1 as an abnormal class. Using the cleaned features, three models of stroke risk level are built with machine learning methods. The importance of the "8+2" factors from the China National Stroke Prevention Project (CSPP) is evaluated via decision tree and random forest models. Apart from the "8+2" factors, the importance of features…
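The two ideas named in the abstract, filling missing values with -1 as a separate class and ranking features with tree-based models, can be illustrated with a minimal sketch. It is not the authors' pipeline. The records, the field names `hypertension` and `inactive`, and the helper functions are hypothetical. The score is the Gini impurity decrease of a single multiway split on one feature, which is only the building block that decision trees and random forests aggregate into feature importance.

```python
from collections import Counter

MISSING = -1  # sentinel treated as its own "abnormal" category, as in the abstract


def fill_missing(records, features):
    """Return copies of the records with absent or None values set to MISSING."""
    return [
        {f: (r.get(f) if r.get(f) is not None else MISSING) for f in features}
        for r in records
    ]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(rows, labels, feature):
    """Impurity drop from one split on every distinct value of a feature."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    weighted = sum(len(g) / len(labels) * gini(g) for g in groups.values())
    return gini(labels) - weighted


# Hypothetical toy records; names and values are illustrative, not study data.
raw = [
    {"hypertension": 1, "inactive": 1},
    {"hypertension": 1, "inactive": None},
    {"hypertension": 0, "inactive": 0},
    {"hypertension": 0},
]
stroke = [1, 1, 0, 0]
features = ["hypertension", "inactive"]

rows = fill_missing(raw, features)
scores = {f: impurity_decrease(rows, stroke, f) for f in features}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because -1 is kept as a category of its own, records with a missing value form their own branch of the split rather than being dropped or imputed, so a feature with many missing entries can still contribute to the ranking.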
Taxonomy
Topics: Neurological Disorders and Treatments · Acute Ischemic Stroke Management · Traditional Chinese Medicine Studies
Methods: Logistic Regression
