Comparison of Missing Data Imputation Methods using the Framingham Heart study dataset

Konstantinos Psychogyios; Loukas Ilias; Dimitris Askounis

arXiv:2210.03154·cs.LG·November 8, 2022·1 cites


Open Access

TL;DR

This study compares advanced GAN and Autoencoder-based missing data imputation methods on the Framingham Heart Study dataset, demonstrating significant improvements over traditional techniques in both imputation accuracy and predictive performance.

Contribution

It introduces modified GAN and Autoencoder methods for missing data imputation and evaluates their effectiveness on medical datasets, showing notable performance gains.

Findings

01. Improvements of 0.20 in normalized RMSE

02. 7.00% increase in AUROC

03. 2.50% higher F1-score in post-imputation prediction

Abstract

Cardiovascular disease (CVD) is a class of diseases that involve the heart or blood vessels and, according to the World Health Organization, is the leading cause of death worldwide. Electronic health record (EHR) data for CVD, like medical data in general, very frequently contain missing values. The percentage of missingness varies and is linked to instrument errors, manual data entry procedures, etc. Even though the missing rate is usually significant, in many cases missing value imputation is handled poorly, either with case deletion or with simple statistical approaches such as mode and median imputation. These methods are known to introduce significant bias, since they do not account for the relationships between the dataset's variables. Within the medical framework, many datasets consist of lab tests or patient medical tests, where these relationships are present and strong. To…
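The abstract's point that median imputation ignores relationships between variables can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (not the Framingham Heart Study dataset), the function names are hypothetical, and a simple least-squares line stands in for a relationship-aware imputer. It is not the GAN or Autoencoder methods the paper evaluates.

```python
import numpy as np

def median_impute(X):
    """Fill NaNs in each column with that column's observed median."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = medians[cols]
    return filled

def regression_impute(X, target, predictor):
    """Fill NaNs in column `target` using a least-squares line on `predictor`.

    Illustrative stand-in for a relationship-aware imputer (assumption).
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    observed = ~np.isnan(X[:, target]) & ~np.isnan(X[:, predictor])
    slope, intercept = np.polyfit(X[observed, predictor], X[observed, target], 1)
    missing = np.isnan(X[:, target]) & ~np.isnan(X[:, predictor])
    filled[missing, target] = slope * X[missing, predictor] + intercept
    return filled

def rmse_on_missing(truth, imputed, mask):
    """RMSE computed only over the entries that were masked out."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

# Synthetic data: column 1 depends strongly on column 0 (assumption).
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = 2.0 * a + rng.normal(scale=0.1, size=n)
truth = np.column_stack([a, b])

# Hide roughly 30% of column 1 completely at random.
mask = np.zeros_like(truth, dtype=bool)
mask[rng.random(n) < 0.3, 1] = True
X = truth.copy()
X[mask] = np.nan

err_median = rmse_on_missing(truth, median_impute(X), mask)
err_regression = rmse_on_missing(truth, regression_impute(X, target=1, predictor=0), mask)
print(f"median RMSE: {err_median:.3f}, regression RMSE: {err_regression:.3f}")
```

On data like this, where one variable largely determines the other, the median fill assigns the same value to every missing entry, while the regression fill uses the observed predictor. That difference is the kind of gap the abstract attributes to methods that ignore inter-variable relationships.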

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare

Methods: Test