Handling missing values in clinical machine learning: Insights from an expert study

Lena Stempfle; Arthur James; Julie Josse; Tobias Gauss; Fredrik D. Johansson

arXiv:2411.09591·cs.LG·February 12, 2025

Open Access

TL;DR

This study surveys clinicians about their use of interpretable machine learning models in clinical settings with missing data, and finds that they favor models that handle missing values natively over traditional imputation methods.

Contribution

The paper provides empirical insight into clinicians' preferences and reasoning, and emphasizes the need for interpretable machine learning (IML) models that incorporate clinical intuition and handle missing data natively.

Findings

01. Clinicians prefer models that natively handle missing data over imputation.

02. Traditional imputation methods often conflict with clinicians' intuition.

03. Clinicians rely on observed features and medical experience rather than imputation.

Abstract

Inherently interpretable machine learning (IML) models offer valuable support for clinical decision-making but face challenges when features contain missing values. Traditional approaches, such as imputation or discarding incomplete records, are often impractical in scenarios where data is missing at test time. We surveyed 55 clinicians from 29 French trauma centers, collecting 20 complete responses to study their interaction with three IML models in a real-world clinical setting for predicting hemorrhagic shock with missing values. Our findings reveal that while clinicians recognize the value of interpretability and are familiar with common IML approaches, traditional imputation techniques often conflict with their intuition. Instead of imputing unobserved values, they rely on observed features combined with medical intuition and experience. As a result, methods that natively handle…
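The contrast between imputing a missing value and handling it natively can be made concrete with a small sketch. The code below is not one of the three IML models from the study. It is a minimal illustration, assuming a hypothetical linear risk score with invented weights, training means, and feature names. For the native variant it uses a missing-indicator term, which is one common way to handle missing values natively and is swapped in here purely for illustration. The point is what the explanation shows: under imputation the unmeasured feature appears with a filled-in value, while the native variant states that it was not measured.

```python
from typing import Dict, List, Optional, Tuple

Patient = Dict[str, Optional[float]]
Row = Tuple[str, str, float]  # (feature, value shown to the user, contribution)

# All numbers below are invented for illustration; they are not clinical values.
WEIGHTS = {"heart_rate": 0.02, "systolic_bp": -0.03, "lactate": 0.5}
TRAIN_MEANS = {"heart_rate": 90.0, "systolic_bp": 120.0, "lactate": 2.0}
# Hypothetical contribution applied when a feature was not measured.
MISSING_WEIGHTS = {"heart_rate": 0.0, "systolic_bp": 0.0, "lactate": 0.4}


def contributions_imputed(patient: Patient) -> List[Row]:
    """Mean-impute missing features, then report per-feature contributions."""
    rows = []
    for name, weight in WEIGHTS.items():
        value = patient.get(name)
        if value is None:
            # The imputed value is displayed as if it had been measured.
            value = TRAIN_MEANS[name]
        rows.append((name, f"{value:g}", weight * value))
    return rows


def contributions_native(patient: Patient) -> List[Row]:
    """Keep missingness explicit: a missing feature gets its own term."""
    rows = []
    for name, weight in WEIGHTS.items():
        value = patient.get(name)
        if value is None:
            rows.append((name, "not measured", MISSING_WEIGHTS[name]))
        else:
            rows.append((name, f"{value:g}", weight * value))
    return rows


# Hypothetical patient whose lactate is unavailable at prediction time.
patient: Patient = {"heart_rate": 125.0, "systolic_bp": 85.0, "lactate": None}

for label, rows in (
    ("imputed", contributions_imputed(patient)),
    ("native", contributions_native(patient)),
):
    print(label)
    for name, shown, contribution in rows:
        print(f"  {name:12s} {shown:>12s} {contribution:+.2f}")
```

Both variants treat the observed features identically. They differ only in the row for the unmeasured feature, which is where an imputed value could conflict with a clinician's reading of the case.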

Taxonomy

Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Sepsis Diagnosis and Treatment

Methods: ADaptive gradient method with the OPTimal convergence rate