Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare

Arno Blaas; Adam Goliński; Andrew Miller; Luca Zappella; Jörn-Henrik Jacobsen; Christina Heinze-Deml

arXiv:2410.19575·stat.ML·October 28, 2024

Open Access

TL;DR

This paper examines how distribution shifts affect diagnostic models in healthcare, emphasizing the importance of covariate inclusion and causality-based approaches to improve robustness, supported by simulations and ECG data analysis.

Contribution

The paper applies causal reasoning to predictive modeling in healthcare and shows when including covariates improves robustness to distribution shifts.

Findings

01 Ignoring covariates leads to less robust models.

02 Invariant learning methods may not always improve robustness.

03 Including certain covariates can significantly enhance model stability.

Abstract

We consider robustness to distribution shifts in the context of diagnostic models in healthcare, where the prediction target $Y$, e.g., the presence of a disease, is causally upstream of the observations $X$, e.g., a biomarker. Distribution shifts may occur, for instance, when the training data is collected in a domain with patients having particular demographic characteristics while the model is deployed on patients from a different demographic group. In the domain of applied ML for health, it is common to predict $Y$ from $X$ without considering further information about the patient. However, beyond the direct influence of the disease $Y$ on biomarker $X$, a predictive model may learn to exploit confounding dependencies (or shortcuts) between $X$ and $Y$ that are unstable under certain distribution shifts. In this work, we highlight a data generating mechanism common to healthcare…
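The shortcut effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experiment: the binary covariate `A`, the coefficients, the noise level, and the probabilities `P(Y=1 | A)` are all assumptions chosen for illustration, and the logistic regression is fitted with a plain Newton iteration written for this example. The disease `Y` and the covariate `A` both influence the biomarker `X`, and the dependence between `A` and `Y` is reversed between the training and deployment domains.

```python
import numpy as np


def sample_domain(n, p_y_given_a, rng):
    """Draw (X, A, Y) from a toy mechanism; every number here is an assumption."""
    a = rng.integers(0, 2, size=n)                        # hypothetical binary covariate
    p_y = np.where(a == 1, p_y_given_a[1], p_y_given_a[0])
    y = (rng.random(n) < p_y).astype(float)               # disease status Y
    x = 1.0 * y + 2.0 * a + 0.25 * rng.normal(size=n)     # biomarker X caused by Y and A
    return x, a.astype(float), y


def fit_logistic(features, y, n_iter=25):
    """Fit logistic regression with an intercept using Newton's method."""
    design = np.column_stack([features, np.ones(len(y))])
    w = np.zeros(design.shape[1])
    for _ in range(n_iter):
        z = np.clip(design @ w, -30.0, 30.0)              # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        grad = design.T @ (p - y)
        hess = (design * (p * (1.0 - p))[:, None]).T @ design
        hess += 1e-6 * np.eye(len(w))                     # small damping for stability
        w -= np.linalg.solve(hess, grad)
    return w


def accuracy(w, features, y):
    """Fraction of samples whose predicted label (logit > 0) matches y."""
    design = np.column_stack([features, np.ones(len(y))])
    return float(np.mean((design @ w > 0) == (y == 1)))


def run_simulation(seed=0, n=20000):
    rng = np.random.default_rng(seed)
    # Training domain: Y is more likely when A = 1 (assumed values).
    x_tr, a_tr, y_tr = sample_domain(n, (0.2, 0.8), rng)
    # Deployment domain: the A-Y dependence is reversed (assumed shift).
    x_te, a_te, y_te = sample_domain(n, (0.8, 0.2), rng)

    results = {}
    for name, use_a in [("x_only", False), ("x_and_a", True)]:
        f_tr = np.column_stack([x_tr, a_tr]) if use_a else x_tr
        f_te = np.column_stack([x_te, a_te]) if use_a else x_te
        w = fit_logistic(f_tr, y_tr)
        results[name] = {
            "train": accuracy(w, f_tr, y_tr),
            "test": accuracy(w, f_te, y_te),
        }
    return results


if __name__ == "__main__":
    for name, acc in run_simulation().items():
        print(name, acc)
```

In this toy setup the model that sees only `X` leans on the training-domain association between `A` and `Y`, so its accuracy is expected to drop sharply once that association reverses, while the model that also receives `A` can separate the covariate's effect on `X` from the disease's effect. Whether the same holds for a real dataset depends on the actual causal structure, which this sketch simply assumes.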

Taxonomy

Topics: Machine Learning in Healthcare