Stable predictions for health related anticausal prediction tasks affected by selection biases: the need to deconfound the test set features
Elias Chaibub Neto, Phil Snyder, Solveig K Sieberts, Larsson Omberg

TL;DR
This paper demonstrates that achieving stable predictions in health-related anticausal tasks requires deconfounding both training and test set features, especially under selection biases, to improve generalization across environments.
Contribution
It shows that deconfounding the test set features, in addition to the training data, improves prediction stability in health-related anticausal prediction tasks.
Findings
Deconfounding test features improves prediction stability.
Selection biases make the associations between confounders and the outcome variable unstable across environments.
Deconfounding both the training and test set features leads to better generalization.
Abstract
In health related machine learning applications, the training data often corresponds to a non-representative sample from the target populations where the learners will be deployed. In anticausal prediction tasks, selection biases often make the associations between confounders and the outcome variable unstable across different target environments. As a consequence, the predictions from confounded learners are often unstable, and might fail to generalize in shifted test environments. Stable prediction approaches aim to solve this problem by producing predictions that are stable across unknown test environments. These approaches, however, are sometimes applied to the training data alone with the hope that training an unconfounded model will be enough to generate stable predictions in shifted test sets. Here, we show that this is insufficient, and that improved stability can be achieved by…
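The argument in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes a toy linear anticausal model (outcome `y` and confounder `c` both cause a single feature `x`), mimics selection bias by flipping the sign of the `y`-`c` correlation between training and test data, and uses plain linear residualization of `x` on `c` as a stand-in deconfounding step. All function names, coefficients, and sample sizes are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, rho, rng):
    """Toy anticausal data: outcome y and confounder c both cause feature x.

    rho is the y-c correlation, standing in for a selection-induced association.
    """
    cov = [[1.0, rho], [rho, 1.0]]
    y, c = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    x = 1.0 * y + 2.0 * c + rng.normal(scale=0.5, size=n)
    return x, c, y


def residualize(x, c):
    """Remove the linear effect of c from x (regress x on c, keep residuals)."""
    slope, intercept = np.polyfit(c, x, deg=1)
    return x - (slope * c + intercept)


def fit_predict(x_train, y_train, x_test):
    """Fit a one-feature linear model on the training data, predict on test."""
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    return slope * x_test + intercept


def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(0)
x_tr, c_tr, y_tr = simulate(5000, rho=0.8, rng=rng)   # training environment
x_te, c_te, y_te = simulate(5000, rho=-0.8, rng=rng)  # shifted test environment

# Residualization needs only features and the confounder, not test labels.
x_tr_dec = residualize(x_tr, c_tr)
x_te_dec = residualize(x_te, c_te)

results = {
    # Raw features in both training and test sets.
    "confounded": corr(fit_predict(x_tr, y_tr, x_te), y_te),
    # Deconfounded training features, raw test features.
    "train_only": corr(fit_predict(x_tr_dec, y_tr, x_te), y_te),
    # Deconfounded features in both sets.
    "train_and_test": corr(fit_predict(x_tr_dec, y_tr, x_te_dec), y_te),
}

for name, value in results.items():
    print(f"{name}: test correlation = {value:.2f}")
```

In this toy setup the model trained on deconfounded features still receives test inputs that carry the confounder's contribution, so its test predictions remain negatively correlated with the outcome, just like the confounded model's. Only when the test features are residualized as well does the correlation become positive. With a single feature the first two variants give the same test correlation, since both apply a positive slope to the same raw test feature.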
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
