Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare
Arno Blaas, Adam Goliński, Andrew Miller, Luca Zappella, Jörn-Henrik Jacobsen, Christina Heinze-Deml

TL;DR
This paper examines how distribution shifts affect diagnostic models in healthcare, emphasizing the importance of covariate inclusion and causality-based approaches to improve robustness, supported by simulations and ECG data analysis.
Contribution
It applies causality theory to healthcare predictive modeling, demonstrating when covariate inclusion enhances robustness against distribution shifts.
Findings
Ignoring covariates leads to less robust models.
Invariant learning methods do not always improve robustness.
Including certain covariates can significantly enhance model stability.
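The first and third findings can be illustrated with a toy simulation. The sketch below is not the paper's experiment: the covariate name A, the prevalences, the effect sizes, and the threshold classifier are all assumptions chosen for illustration. The disease Y causes the biomarker X, a binary covariate A (for example, a demographic group) shifts both the disease prevalence and the biomarker baseline, and only the share of the A = 1 group changes between training and deployment.

```python
import numpy as np

def sample(n, p_a, rng):
    """Draw (X, A, Y) from a hypothetical anti-causal mechanism.

    A: binary covariate; only P(A=1) = p_a differs between domains.
    Y: disease status; prevalence depends on A (assumed 0.3 vs 0.6).
    X: biomarker; raised by Y, lowered by A, plus Gaussian noise.
    """
    a = rng.random(n) < p_a
    y = rng.random(n) < np.where(a, 0.6, 0.3)
    x = 2.0 * y - 2.0 * a + rng.normal(size=n)
    return x, a, y

def fit_threshold(x, y):
    """Pick the cut-off on X that maximizes training accuracy (grid search)."""
    grid = np.linspace(-6.0, 8.0, 281)
    acc = [np.mean((x > t) == y) for t in grid]
    return grid[int(np.argmax(acc))]

def accuracy(pred, y):
    return float(np.mean(pred == y))

rng = np.random.default_rng(0)
x_tr, a_tr, y_tr = sample(20000, 0.1, rng)  # training domain: mostly A = 0
x_te, a_te, y_te = sample(20000, 0.9, rng)  # deployment domain: mostly A = 1

# Model 1: predict Y from X alone, with one threshold for everyone.
t_all = fit_threshold(x_tr, y_tr)

# Model 2: predict Y from X and A, with one threshold per covariate group.
t_a0 = fit_threshold(x_tr[~a_tr], y_tr[~a_tr])
t_a1 = fit_threshold(x_tr[a_tr], y_tr[a_tr])

def predict_with_a(x, a):
    return x > np.where(a, t_a1, t_a0)

acc_x_only_train = accuracy(x_tr > t_all, y_tr)
acc_x_only_test = accuracy(x_te > t_all, y_te)
acc_with_a_train = accuracy(predict_with_a(x_tr, a_tr), y_tr)
acc_with_a_test = accuracy(predict_with_a(x_te, a_te), y_te)

print(f"X only:  train={acc_x_only_train:.3f}  test={acc_x_only_test:.3f}")
print(f"X and A: train={acc_with_a_train:.3f}  test={acc_with_a_test:.3f}")
```

In this toy setup the X-only threshold is tuned to the majority training group, so its accuracy should drop sharply once the group mix changes. The rule that conditions on A relies only on mechanisms that stay fixed across the two domains, so its accuracy should stay roughly stable.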
Abstract
We consider robustness to distribution shifts in the context of diagnostic models in healthcare, where the prediction target Y, e.g., the presence of a disease, is causally upstream of the observations X, e.g., a biomarker. Distribution shifts may occur, for instance, when the training data is collected in a domain with patients having particular demographic characteristics while the model is deployed on patients from a different demographic group. In the domain of applied ML for health, it is common to predict Y from X without considering further information about the patient. However, beyond the direct influence of the disease Y on biomarker X, a predictive model may learn to exploit confounding dependencies (or shortcuts) between X and Y that are unstable under certain distribution shifts. In this work, we highlight a data generating mechanism common to healthcare…
Taxonomy
Topics: Machine Learning in Healthcare
