All models are local: time to replace external validation with recurrent local validation
Alex Youssef, Michael Pencina, Anshul Thakur, Tingting Zhu, David Clifton, Nigam H. Shah

TL;DR
The paper argues that external validation is insufficient for clinical ML models and proposes a recurrent local validation paradigm inspired by MLOps, emphasizing site-specific reliability tests for safer, more adaptable model deployment.
Contribution
It introduces a novel paradigm of recurring local validation to replace external validation, addressing data variability and model safety in clinical ML.
Findings
External validation does not guarantee model safety or utility.
Recurring local validation improves model reliability across sites.
Site-specific tests prevent performance issues due to data shifts.
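To make the idea of a recurring, site-specific check concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' method: the choice of AUROC as the metric, the 0.75 threshold, the quarterly windows, the toy data, and the names `auroc` and `recurring_local_validation` are all assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple

# One validation window: (window name, true labels, model scores).
Window = Tuple[str, Sequence[int], Sequence[float]]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win. Quadratic in the number of cases, which is
    fine for a sketch but not for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recurring_local_validation(
    windows: Sequence[Window], min_auroc: float = 0.75
) -> List[Tuple[str, float, bool]]:
    """Re-check the model on each local data window against a threshold.

    `min_auroc` is a hypothetical site-chosen threshold, not a value
    taken from the paper.
    """
    results = []
    for name, labels, scores in windows:
        value = auroc(labels, scores)
        results.append((name, value, value >= min_auroc))
    return results


# Toy data for a single site: performance drops in the second window.
windows = [
    ("2024-Q1", [1, 1, 0, 0], [0.9, 0.8, 0.3, 0.2]),
    ("2024-Q2", [1, 1, 0, 0], [0.6, 0.3, 0.5, 0.4]),
]
report = recurring_local_validation(windows)
for name, value, passed in report:
    print(f"{name}: AUROC={value:.2f} {'pass' if passed else 'FAIL'}")
```

In this sketch a failed window would be the trigger for local review, recalibration, or retraining; what action follows, and which metric and threshold a site uses, is a local decision that the sketch does not settle.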
Abstract
External validation is often recommended to ensure the generalizability of ML models. However, it neither guarantees generalizability nor equates to a model's clinical usefulness (the ultimate goal of any clinical decision-support tool). External validation is misaligned with current healthcare ML needs. First, patient data changes across time, geography, and facilities. These changes create significant volatility in the performance of a single fixed model (especially for deep learning models, which dominate clinical ML). Second, newer ML techniques, current market forces, and updated regulatory frameworks are enabling frequent updating and monitoring of individual deployed model instances. We submit that external validation is insufficient to establish ML models' safety or utility. Proposals to fix the external validation paradigm do not go far enough. Continued reliance on it as the…
Taxonomy
Topics: Machine Learning in Healthcare · Electronic Health Records Systems · Healthcare Operations and Scheduling Optimization
Methods: Test
