Machine Learning Models for Predicting Smoking-Related Health Decline and Disease Risk
Vaskar Chakma, MD Jaheid Hasan Nerab, Abdur Rouf, Abu Sayed, Hossem MD Saim, Md. Nournabi Khan

TL;DR
This study evaluates machine learning models, especially Random Forest, for identifying individuals at high risk of smoking-related health decline from clinical screening data, reporting high accuracy and interpretable predictions driven by key health markers.
Contribution
It systematically compares ML algorithms for smoking risk assessment, emphasizing interpretability and practical deployment over algorithmic novelty.
Findings
Random Forest achieved an AUC of 0.926 in risk classification.
Key health indicators like blood pressure and liver enzymes are crucial for prediction.
SHAP analysis identified the most influential health markers for smoking-related risk.
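To make the AUC figure above concrete, here is a minimal sketch of what the metric measures: the probability that a randomly chosen high-risk individual receives a higher predicted score than a randomly chosen low-risk one. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = high risk, 0 = low risk.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

# 8 of the 9 positive/negative pairs are ordered correctly.
toy_auc = roc_auc(labels, scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Read this way, an AUC of 0.926 means the model ranks a high-risk individual above a low-risk one in roughly 93 of 100 such pairs. The pairwise loop is quadratic in the number of samples, so a rank-based implementation would be preferable on a dataset of tens of thousands of records.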
Abstract
Smoking continues to be a major preventable cause of death worldwide, affecting millions through damage to the heart, metabolism, liver, and kidneys. However, current medical screening methods often miss the early warning signs of smoking-related health problems, leading to late-stage diagnoses when treatment options become limited. This study presents a systematic comparative evaluation of machine learning approaches for smoking-related health risk assessment, emphasizing clinical interpretability and practical deployment over algorithmic innovation. We analyzed health screening data from 55,691 individuals, examining various health indicators, including body measurements, blood tests, and demographic information. We tested three advanced prediction algorithms (Random Forest, XGBoost, and LightGBM) to determine which could most accurately identify people at high risk. This study…
Taxonomy
Topics: Smoking Behavior and Cessation · Health, Environment, Cognitive Aging · Cardiovascular Health and Risk Factors
