Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?

Zhi Chen; Sarah Tan; Urszula Chajewska; Cynthia Rudin; Rich Caruana

arXiv:2304.11749·cs.LG·April 25, 2023·5 cites

Open Access

TL;DR

This paper explores how interpretable machine learning, specifically Explainable Boosting Machines, can improve understanding and handling of missing data in healthcare, potentially enhancing decision-making and reducing imputation risks.

Contribution

It introduces novel methods using interpretable EBMs to analyze missingness mechanisms and assess imputation risks in medical datasets.

Findings

01. EBMs provide insights into missingness causes

02. Proposed methods detect risks from imputation

03. Experiments show improved understanding in healthcare data

Abstract

Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the consequences may affect healthcare decisions. There are many methods in the literature for dealing with missing values, including state-of-the-art methods which often depend on black-box models for imputation. In this work, we show how recent advances in interpretable machine learning provide a new perspective for understanding and tackling the missing value problem. We propose methods based on high-accuracy glass-box Explainable Boosting Machines (EBMs) that can help users (1) gain new insights on missingness mechanisms and better understand the causes of missingness, and (2) detect -- or even alleviate -- potential risks introduced…

Taxonomy

Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Chronic Disease Management Strategies