Looking for Out-of-Distribution Environments in Multi-center Critical Care Data
Dimitris Spathis, Stephanie L. Hyland

TL;DR
This paper investigates the challenge of cross-hospital generalization in clinical machine learning, proposing methods to identify Out-of-Distribution environments and analyzing their impact on model performance.
Contribution
It introduces model-based and heuristic approaches to detect OoD environments in critical care data and systematically compares models with varying access to OoD information.
Findings
Access to OoD data does not improve performance
Limitations in defining OoD environments due to data harmonisation
Current evaluation methods are insufficient for robust clinical models
Abstract
Clinical machine learning models show a significant performance drop when tested in settings not seen during training. Domain generalisation models promise to alleviate this problem; however, there is still scepticism about whether they improve over traditional training. In this work, we take a principled approach to identifying Out-of-Distribution (OoD) environments, motivated by the problem of cross-hospital generalization in critical care. We propose model-based and heuristic approaches to identify OoD environments and systematically compare models with different levels of held-out information. We find that access to OoD data does not translate to increased performance, pointing to inherent limitations in defining OoD environments, potentially due to data harmonisation and sampling. Echoing similar results with other popular clinical benchmarks in the literature, new…
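The abstract does not spell out how the model-based identification works, so the following is only an illustrative sketch of one plausible reading: hold each hospital out of training in turn and rank hospitals by how much the score on that hospital drops. The function `rank_ood_candidates`, the callback `train_and_score`, the toy "model", and the hospital names and values are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def rank_ood_candidates(
    envs: Sequence[str],
    train_and_score: Callable[[List[str], str], float],
) -> List[Tuple[str, float]]:
    """Rank environments by the score drop seen when each is held out of training.

    train_and_score(train_envs, test_env) is a user-supplied (hypothetical)
    callback that trains on train_envs and returns a score on test_env,
    where higher is better.
    """
    gaps: Dict[str, float] = {}
    for held_out in envs:
        others = [e for e in envs if e != held_out]
        score_seen = train_and_score(list(envs), held_out)  # held_out included in training
        score_unseen = train_and_score(others, held_out)    # held_out excluded from training
        gaps[held_out] = score_seen - score_unseen
    # Largest gap first: these are the strongest OoD candidates under this heuristic.
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Toy stand-in for real training: each hospital is summarised by one number,
# the "model" predicts the mean over its training hospitals, and the score is
# the negative absolute error on the test hospital. Values are made up.
HOSPITAL_MEANS = {"A": 0.0, "B": 0.1, "C": 1.0}


def toy_train_and_score(train_envs: List[str], test_env: str) -> float:
    prediction = sum(HOSPITAL_MEANS[e] for e in train_envs) / len(train_envs)
    return -abs(prediction - HOSPITAL_MEANS[test_env])


ranking = rank_ood_candidates(list(HOSPITAL_MEANS), toy_train_and_score)
for env, gap in ranking:
    print(f"{env}: gap={gap:.3f}")
```

In this toy setup hospital C, whose summary value is far from the others, shows the largest gap. With real data, `train_and_score` would wrap an actual training and evaluation loop, and a small gap across all hospitals would be consistent with the finding that held-out hospitals may not be strongly out of distribution.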
Taxonomy
Topics: Machine Learning in Healthcare · Emergency and Acute Care Studies · Sepsis Diagnosis and Treatment
