An Empirical Framework for Domain Generalization in Clinical Settings
Haoran Zhang, Natalie Dullerud, Laleh Seyyed-Kalantari, Quaid Morris, Shalmali Joshi, Marzyeh Ghassemi

TL;DR
This paper evaluates the effectiveness of domain generalization methods in clinical machine learning, revealing limited improvements in real-world medical imaging but some benefits in specific clinical time series scenarios.
Contribution
It benchmarks eight domain generalization methods on clinical data and introduces a framework for realistic domain shift simulation in healthcare.
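To make the idea of simulated sampling bias concrete, here is a minimal Python sketch. It is not the authors' framework, which this summary does not describe in detail. It assumes a hypothetical record schema of (label, attribute) pairs, both binary, and subsamples a shared pool so that each synthetic environment has a different rate of agreement between the attribute and the label. The names induce_sampling_bias, make_environments, and agree_rate are illustrative only.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical schema: (label, attribute), both binary.
Record = Tuple[int, int]


def induce_sampling_bias(
    records: List[Record], agree_rate: float, seed: int = 0
) -> List[Record]:
    """Subsample so roughly `agree_rate` of kept records have attribute == label."""
    if not 0.0 < agree_rate < 1.0:
        raise ValueError("agree_rate must be strictly between 0 and 1")
    rng = random.Random(seed)
    agree = [r for r in records if r[0] == r[1]]
    disagree = [r for r in records if r[0] != r[1]]
    # Largest sample size for which both groups can supply their share.
    n = int(min(len(agree) / agree_rate, len(disagree) / (1.0 - agree_rate)))
    n_agree = min(len(agree), round(n * agree_rate))
    n_disagree = min(len(disagree), n - n_agree)
    sample = rng.sample(agree, n_agree) + rng.sample(disagree, n_disagree)
    rng.shuffle(sample)
    return sample


def make_environments(
    records: List[Record], agree_rates: List[float], seed: int = 0
) -> Dict[str, List[Record]]:
    """Build one biased subsample per requested agreement rate."""
    return {
        f"env_{i}": induce_sampling_bias(records, rate, seed + i)
        for i, rate in enumerate(agree_rates)
    }


def agreement(sample: List[Record]) -> float:
    """Fraction of records whose attribute equals the label."""
    return sum(1 for y, a in sample if y == a) / len(sample)


if __name__ == "__main__":
    # Balanced toy pool: 250 records for each (label, attribute) combination.
    pool = [(y, a) for y in (0, 1) for a in (0, 1) for _ in range(250)]
    # Two training-like environments with a strong shortcut, one where it flips.
    for name, env in make_environments(pool, [0.9, 0.8, 0.1]).items():
        print(name, len(env), round(agreement(env), 3))
```

A model that relies on the attribute as a shortcut would look strong in the first two environments and degrade in the third, which is the kind of failure a stress test of this sort is meant to expose.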
Findings
Limited out-of-distribution performance gains on medical imaging.
Some scenarios in clinical time series show improved generalization.
Provides best practices for domain generalization in healthcare.
Abstract
Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances across environments. In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data. We introduce a framework to induce synthetic but realistic domain shifts and sampling bias to stress-test these methods over existing non-healthcare benchmarks. We find that current domain generalization methods do not consistently achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data, in line with prior work on general imaging datasets. However, a subset of realistic…
Taxonomy
Topics: Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning · Topic Modeling
