Cardiovascular Disease Prediction using Machine Learning: A Comparative Analysis

Risshab Srinivas Ramesh; Roshani T S Udupa; Monisha J; Kushi K K S

arXiv:2507.21898·cs.LG·July 30, 2025

TL;DR

This study analyzes a CVD dataset of 68,119 records to identify key risk factors and compares machine learning models. CatBoost is the most accurate, and data-quality problems limit prediction reliability.

Contribution

It provides a comprehensive statistical analysis and compares multiple ML models for CVD prediction, emphasizing CatBoost's superior performance and data preprocessing needs.

Findings

01. Age, blood pressure, and cholesterol are primary risk factors.

02. CatBoost achieved the highest accuracy of 0.734.

03. Data issues like outliers impact model reliability.
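
For readers unfamiliar with the accuracy metric quoted in the findings, it is the fraction of predictions that match the true labels. The following is a minimal sketch; the function name `accuracy` and the toy labels are illustrative assumptions, not the paper's code or data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only (1 = CVD, 0 = no CVD).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = accuracy(y_true, y_pred)
print(score)
```

Accuracy alone can hide class imbalance, so it is usually read alongside other metrics when comparing classifiers.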

Abstract

Cardiovascular diseases (CVDs) are a leading cause of mortality globally, accounting for 31% of all deaths. This study uses a cardiovascular disease (CVD) dataset comprising 68,119 records to explore the influence of numerical (age, height, weight, blood pressure, BMI) and categorical (gender, cholesterol, glucose, smoking, alcohol, activity) factors on CVD occurrence. We performed statistical analyses, including t-tests, Chi-square tests, and ANOVA, which identify strong associations between CVD and older age, hypertension, higher weight, and abnormal cholesterol levels, while physical activity appears as a protective factor. A logistic regression model highlights age, blood pressure, and cholesterol as primary risk factors, with unexpected negative associations for smoking and alcohol, suggesting potential data issues. Model performance comparisons reveal CatBoost as the top performer…
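
Two of the test statistics named in the abstract can be sketched with the Python standard library alone. This is a hedged illustration, not the authors' analysis: the helper names `welch_t` and `chi_square` are hypothetical, the Welch (unequal-variance) form of the t-test is an assumption since the abstract does not say which variant was used, and the numbers are made up rather than drawn from the 68,119-record dataset.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy values for illustration only, NOT the paper's data.
age_cvd = [61, 58, 64, 55, 60]        # ages of patients with CVD
age_no_cvd = [50, 47, 53, 49, 51]     # ages of patients without CVD
# Rows: cholesterol normal / above normal; columns: no CVD / CVD.
chol_table = [[30, 20], [10, 40]]

t_stat = welch_t(age_cvd, age_no_cvd)
chi2 = chi_square(chol_table)
print(round(t_stat, 3), round(chi2, 3))
```

A large t statistic suggests the group means differ for a numerical factor such as age, and a large chi-square statistic suggests a categorical factor such as cholesterol level is associated with CVD status. Converting either statistic to a p-value requires the corresponding distribution, which a statistics library would normally supply.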
