Diagnosing Model Performance Under Distribution Shift
Tiffany Tianhui Cai, Hongseok Namkoong, Steve Yadlowsky

TL;DR
This paper introduces DISDE, a method to decompose performance drops of prediction models under distribution shifts into components related to data difficulty, feature-outcome relationships, and unseen examples, aiding diagnosis and improvement.
Contribution
We develop DISDE, a novel approach that attributes performance degradation under distribution shifts to specific causes, enabling targeted model improvements.
Findings
DISDE effectively decomposes performance drops in real-world datasets.
The method provides insights into why domain adaptation methods may fail.
Application to census and satellite data demonstrates practical utility.
Abstract
Prediction models can perform poorly when deployed to target distributions different from the training distribution. To understand these operational failure modes, we develop a method, called DIstribution Shift DEcomposition (DISDE), to attribute a drop in performance to different types of distribution shifts. Our approach decomposes the performance drop into terms for 1) an increase in harder but frequently seen examples from training, 2) changes in the relationship between features and outcomes, and 3) poor performance on examples infrequent or unseen during training. These terms are defined by fixing a distribution on X while varying the conditional distribution of Y | X between training and target, or by fixing the conditional distribution of Y | X while varying the distribution on X. In order to do this, we define a hypothetical distribution on X consisting of…
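The three-term split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a discrete feature X so that conditional mean losses are simple per-value averages, and it assumes one illustrative choice of shared distribution, s(x) proportional to p(x)q(x) / (p(x) + q(x)), which puts weight only on values seen in both training and target. The function name `decompose_drop` and the dictionary keys are hypothetical.

```python
import numpy as np


def decompose_drop(x_train, loss_train, x_target, loss_target):
    """Split (mean target loss - mean training loss) into three terms.

    Illustrative sketch only: X is assumed discrete, and the shared
    distribution s(x) below is an assumed choice, not taken from the text.
    """
    x_train, x_target = np.asarray(x_train), np.asarray(x_target)
    loss_train = np.asarray(loss_train, dtype=float)
    loss_target = np.asarray(loss_target, dtype=float)

    # Indicator matrices: row i marks the samples whose X equals values[i].
    values = np.union1d(x_train, x_target)
    in_train = (x_train[None, :] == values[:, None]).astype(float)
    in_target = (x_target[None, :] == values[:, None]).astype(float)

    # Empirical marginals of X under training (p) and target (q).
    p = in_train.mean(axis=1)
    q = in_target.mean(axis=1)

    def conditional_mean(mask, loss):
        # Mean loss given X = value; 0 where the value never occurs
        # (those entries always receive zero weight below).
        counts = mask.sum(axis=1)
        sums = mask @ loss
        return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

    r_p = conditional_mean(in_train, loss_train)    # loss given x, training Y | X
    r_q = conditional_mean(in_target, loss_target)  # loss given x, target Y | X

    # Assumed shared distribution: nonzero only where both p and q are nonzero.
    s = p * q / (p + q)
    if s.sum() == 0:
        raise ValueError("training and target share no X values")
    s = s / s.sum()

    return {
        # 1) X shifts from training to shared; Y | X fixed at training.
        "harder_seen_examples": s @ r_p - p @ r_p,
        # 2) Y | X shifts from training to target; X fixed at shared.
        "feature_outcome_relationship": s @ r_q - s @ r_p,
        # 3) X shifts from shared to target; Y | X fixed at target.
        "infrequent_or_unseen_examples": q @ r_q - s @ r_q,
        "total_drop": q @ r_q - p @ r_p,
    }
```

Under these assumptions the three terms telescope, so they sum to the total drop. Each term changes only one ingredient at a time: the first and third vary the distribution on X with Y | X held fixed, and the second varies Y | X with the distribution on X held fixed.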
Taxonomy
Topics: Artificial Intelligence in Healthcare
Methods: fail
