Evaluating Model Robustness and Stability to Dataset Shift
Adarsh Subbaswamy, Roy Adams, Suchi Saria

TL;DR
This paper introduces a framework for evaluating the robustness of machine learning models to dataset shifts using existing data, enabling safety assessments without additional data collection.
Contribution
The paper proposes a novel debiased estimator for analyzing model stability under dataset shifts, accommodating complex high-dimensional distributions and realistic shifts.
Findings
Estimator maintains $\sqrt{N}$-consistency with complex models
Framework effectively assesses stability on a real medical risk prediction task
Allows evaluation of safety without extra data collection
Abstract
As the use of machine learning in high impact domains becomes widespread, the importance of evaluating safety has increased. An important aspect of this is evaluating how robust a model is to changes in setting or population, which typically requires applying the model to multiple, independent datasets. Since the cost of collecting such datasets is often prohibitive, in this paper, we propose a framework for analyzing this type of stability using the available data. We use the original evaluation data to determine distributions under which the algorithm performs poorly, and estimate the algorithm's performance on the "worst-case" distribution. We consider shifts in user defined conditional distributions, allowing some distributions to shift while keeping other portions of the data distribution fixed. For example, in a healthcare context, this allows us to consider shifts in clinical…
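The "worst-case" idea in the abstract can be illustrated with a minimal sketch. It assumes a simplified setting that is not the paper's estimator: the entire data distribution is allowed to shift to any subpopulation making up at least a fraction alpha of the evaluation data, so the worst-case mean loss reduces to the average of the largest per-example losses. The paper's framework instead restricts shifts to user-defined conditional distributions and uses a debiased estimator; the function name `worst_case_mean_loss` and the parameter `alpha` are hypothetical names chosen for this illustration.

```python
import numpy as np


def worst_case_mean_loss(losses, alpha):
    """Plug-in estimate of the mean loss on the worst subpopulation.

    Simplifying assumption (not the paper's method): the whole distribution
    may shift to any subpopulation containing at least a fraction `alpha`
    of the evaluation examples. Under that assumption the worst case is
    the mean of the top-`alpha` fraction of per-example losses.
    """
    losses = np.asarray(losses, dtype=float)
    if losses.size == 0:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    n = losses.size
    # Number of examples in the worst subpopulation; the small offset guards
    # against floating-point error pushing an exact product past an integer.
    k = max(1, int(np.ceil(alpha * n - 1e-12)))
    k = min(k, n)

    # Sort ascending and average the k largest losses.
    worst = np.sort(losses)[-k:]
    return float(worst.mean())


# Hypothetical per-example losses from an evaluation set.
example_losses = [0.1, 0.2, 0.9, 0.8, 0.05, 0.3, 0.7, 0.1, 0.2, 0.6]
average_case = worst_case_mean_loss(example_losses, alpha=1.0)
worst_half = worst_case_mean_loss(example_losses, alpha=0.5)
```

With `alpha=1.0` no shift is allowed and the function returns the ordinary average loss; smaller values of `alpha` permit more severe shifts, so the estimate can only stay the same or grow. Comparing the two numbers gives a rough sense of how much performance could degrade under shift, using only the existing evaluation data.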
Taxonomy
Topics: Machine Learning in Healthcare · Healthcare Operations and Scheduling Optimization · Sepsis Diagnosis and Treatment
