A data balancing approach towards design of an expert system for Heart Disease Prediction

Rahul Karmakar; Udita Ghosh; Arpita Pal; Sattwiki Dey; Debraj Malik; Priyabrata Sain

arXiv:2407.18606·cs.LG·July 30, 2024

TL;DR

This paper explores machine learning techniques, especially ensemble methods like Random Forest, combined with feature selection and oversampling, to improve the accuracy of heart disease prediction models.

Contribution

It introduces a comprehensive approach integrating multiple ML models, feature selection techniques, and oversampling to enhance heart disease prediction accuracy.

Findings

01 Random Forest achieved 99.83% accuracy.

02 Ensemble methods outperform individual classifiers.

03 Key predictors include smoking, blood pressure, cholesterol, and inactivity.

Abstract

Heart disease is a serious global health issue that claims millions of lives every year. Early detection and precise prediction are critical to the prevention and successful treatment of heart-related issues. Much research uses machine learning (ML) models to forecast cardiac disease and enable early detection. To perform predictive analysis on the "Heart disease health indicators" dataset, we employed five machine learning methods in this paper: Decision Tree (DT), Random Forest (RF), Linear Discriminant Analysis, Extra Tree Classifier, and AdaBoost. The model is further examined using various feature selection (FS) techniques. To enhance the baseline model, we separately applied four FS techniques: Sequential Forward FS, Sequential Backward FS, Correlation Matrix, and Chi2. Lastly, K-means SMOTE oversampling is applied to the models to enable additional analysis. The…
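The baseline-plus-feature-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic imbalanced data from `make_classification` in place of the "Heart disease health indicators" dataset, maps "Extra Tree Classifier" to `ExtraTreesClassifier` (an assumption), shows only the Chi2 selector with an arbitrary `k=8`, and omits the K-means SMOTE oversampling step entirely. The `evaluate` helper and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
)
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: NOT the paper's dataset).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five classifier families named in the abstract, with default settings.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}


def evaluate(models, k=8):
    """Fit each model behind a Chi2 selector and return test accuracy."""
    scores = {}
    for name, model in models.items():
        # chi2 needs non-negative inputs, so scale features to [0, 1] first.
        pipe = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=k), model)
        pipe.fit(X_train, y_train)
        scores[name] = pipe.score(X_test, y_test)
    return scores


scores = evaluate(models)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

On imbalanced data, accuracy alone can look high even when the minority class is poorly predicted, which is one reason an oversampling step such as the K-means SMOTE mentioned in the abstract is usually evaluated alongside it.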

Taxonomy

Topics: Data Mining Algorithms and Applications · Artificial Intelligence in Healthcare

Methods: Synthetic Minority Over-sampling Technique · Feature Selection