Handling missing values in clinical machine learning: Insights from an expert study
Lena Stempfle, Arthur James, Julie Josse, Tobias Gauss, Fredrik D. Johansson

TL;DR
This study explores clinicians' preferences for interpretable machine learning models in clinical settings with missing data, highlighting the importance of models that natively handle missing values over traditional imputation methods.
Contribution
The study provides empirical insights into clinicians' preferences and reasoning, emphasizing the need for interpretable machine learning (IML) models that incorporate clinical intuition and handle missing data inherently.
Findings
Clinicians prefer models that natively handle missing data over imputation.
Traditional imputation methods often conflict with clinicians' intuition.
Clinicians rely on observed features and medical experience rather than imputation.
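To make the contrast in these findings concrete, the following is a minimal sketch, not the models evaluated in the study. It assumes a hypothetical additive risk score with made-up feature names, weights, and training means. One function fills a missing feature with its training mean (imputation); the other skips unobserved features and reports which features it actually used, which is one simple way a model might handle missingness natively.

```python
# Hypothetical weights for a simple additive risk score (illustrative only,
# not taken from the paper).
WEIGHTS = {
    "heart_rate_high": 2.0,
    "low_systolic_bp": 3.0,
    "low_hemoglobin": 2.5,
}

# Hypothetical training-set means, used only by the imputation baseline.
TRAIN_MEANS = {
    "heart_rate_high": 0.3,
    "low_systolic_bp": 0.2,
    "low_hemoglobin": 0.25,
}


def score_with_mean_imputation(patient):
    """Replace each missing feature (None) with its training mean, then sum."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = patient.get(name)
        if value is None:
            value = TRAIN_MEANS[name]  # imputed, never actually observed
        total += weight * value
    return total


def score_observed_only(patient):
    """Sum contributions of observed features only; list what was used."""
    total = 0.0
    used = []
    for name, weight in WEIGHTS.items():
        value = patient.get(name)
        if value is None:
            continue  # missing feature contributes nothing and is not guessed
        total += weight * value
        used.append(name)
    return total, used
```

With the observed-only variant, the explanation shown to a clinician mentions only measurements that were actually taken, rather than a value the model filled in.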
Abstract
Inherently interpretable machine learning (IML) models offer valuable support for clinical decision-making but face challenges when features contain missing values. Traditional approaches, such as imputation or discarding incomplete records, are often impractical in scenarios where data is missing at test time. We surveyed 55 clinicians from 29 French trauma centers, collecting 20 complete responses to study their interaction with three IML models in a real-world clinical setting for predicting hemorrhagic shock with missing values. Our findings reveal that while clinicians recognize the value of interpretability and are familiar with common IML approaches, traditional imputation techniques often conflict with their intuition. Instead of imputing unobserved values, they rely on observed features combined with medical intuition and experience. As a result, methods that natively handle…
Taxonomy
Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Sepsis Diagnosis and Treatment
Methods: ADaptive gradient method with the OPTimal convergence rate
