Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging
Dovile Juodelyte, Yucheng Lu, Amelia Jiménez-Sánchez, Sabrina Bottazzi, Enzo Ferrante, Veronika Cheplygina

TL;DR
This study investigates how the choice of source dataset in transfer learning affects model robustness in medical imaging, revealing that ImageNet models overfit confounders more than RadImageNet models.
Contribution
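The paper introduces the Medical Imaging Contextualized Confounder Taxonomy (MICCAT) to conceptualize confounders and demonstrates that the choice of source dataset affects how strongly medical imaging models overfit to confounders, and therefore how robust they are.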
Findings
ImageNet models overfit confounders more than RadImageNet models.
Classification performance is comparable between ImageNet- and RadImageNet-pretrained models, but robustness to confounders differs.
The authors recommend that researchers using ImageNet-pretrained models reexamine model robustness.
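
To make "overfitting to a confounder" concrete, here is a minimal sketch of a synthetic confounder, not the paper's code. It stamps a bright corner tag on toy images so that the tag agrees with the label in a training split and disagrees in a flipped test split. The function names, tag size, and toy data are all assumptions made for illustration; a rule that reads only the tag stands in for a model that has learned the shortcut.

```python
import numpy as np

def add_tag_confounder(images, labels, p_match=1.0, size=4, value=1.0, seed=0):
    """Stamp a bright square in the top-left corner of some images (hypothetical helper).

    With probability p_match the tag agrees with the label (tag on positives,
    none on negatives); otherwise the relationship is flipped.
    """
    rng = np.random.default_rng(seed)
    out = images.copy()
    match = rng.random(len(labels)) < p_match
    tagged = np.where(match, labels == 1, labels == 0)
    out[tagged, :size, :size] = value
    return out, tagged

def tag_only_accuracy(images, labels, size=4, threshold=0.99):
    """Accuracy of a rule that predicts 'positive' whenever the corner tag is present."""
    corner_mean = images[:, :size, :size].mean(axis=(1, 2))
    preds = (corner_mean > threshold).astype(int)
    return float((preds == labels).mean())

# Toy stand-in for an imaging dataset: random 32x32 "scans" with binary labels.
rng = np.random.default_rng(42)
images = rng.random((200, 32, 32)).astype(np.float32)
labels = rng.integers(0, 2, size=200)

# Training split: tag perfectly correlated with the label.
train_x, train_tag = add_tag_confounder(images, labels, p_match=1.0)
# Flipped test split: tag perfectly anti-correlated with the label.
test_x, test_tag = add_tag_confounder(images, labels, p_match=0.0)

acc_train = tag_only_accuracy(train_x, labels)
acc_flipped = tag_only_accuracy(test_x, labels)
print(f"tag-only accuracy, confounded split: {acc_train:.2f}")
print(f"tag-only accuracy, flipped split:    {acc_flipped:.2f}")
```

The gap between the two accuracies is the kind of signal a robustness check looks for: a model that scores well on the confounded split but collapses on the flipped split is relying on the tag rather than on the image content.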
Abstract
Transfer learning has become an essential part of medical imaging classification algorithms, often leveraging ImageNet weights. The domain shift from natural to medical images has prompted alternatives such as RadImageNet, often showing comparable classification performance. However, it remains unclear whether the performance gains from transfer learning stem from improved generalization or shortcut learning. To address this, we conceptualize confounders by introducing the Medical Imaging Contextualized Confounder Taxonomy (MICCAT) and investigate a range of confounders across it -- whether synthetic or sampled from the data -- using two public chest X-ray and CT datasets. We show that ImageNet and RadImageNet achieve comparable classification performance, yet ImageNet is much more prone to overfitting to confounders. We recommend that researchers using ImageNet-pretrained models…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Medical Imaging Techniques and Applications · AI in cancer detection
