Can we trust deep learning models diagnosis? The impact of domain shift in chest radiograph classification
Eduardo H. P. Pooch, Pedro L. Ballester, Rodrigo C. Barros

TL;DR
This paper investigates how domain shift affects the reliability of deep learning models in chest radiograph classification, revealing significant performance drops across different datasets and highlighting the importance of dataset selection.
Contribution
It evaluates the effects of domain shift across four of the largest chest radiograph datasets, emphasizing the impact on model generalization and reliability.
Findings
Training on one dataset and testing on another drastically reduces performance.
Models trained on CheXpert and MIMIC-CXR generalize better to other datasets.
High domain shift raises questions about the trustworthiness of models trained on public datasets.
Abstract
While deep learning models become more widespread, their ability to handle unseen data and generalize to any scenario is yet to be challenged. In medical imaging, there is a high heterogeneity of distributions among images based on the equipment that generates them and their parametrization. This heterogeneity triggers a common issue in machine learning called domain shift, which represents the difference between the training data distribution and the distribution of the data on which a model is deployed. A high domain shift tends to result in poor generalization performance. In this work, we evaluate the extent of domain shift on four of the largest datasets of chest radiographs. We show how training and testing with different datasets (e.g., training on ChestX-ray14 and testing on CheXpert) drastically affects model performance, posing a big question over the reliability of…
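The train-on-one, test-on-another protocol described in the abstract can be sketched as a small evaluation grid. This is a minimal illustration, not the authors' code: the choice of ROC AUC as the metric, the function names `roc_auc`, `cross_dataset_auc`, and `fit_sign_scorer`, and the one-feature toy "model" are all assumptions made for the example. A real study would plug an image classifier and the actual dataset splits into the same loop.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Split = Tuple[List[float], List[int]]   # (one feature per example, binary labels)
Scorer = Callable[[float], float]       # maps a feature to a prediction score


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_dataset_auc(
    train_sets: Dict[str, Split],
    test_sets: Dict[str, Split],
    fit: Callable[[List[float], List[int]], Scorer],
) -> Dict[Tuple[str, str], float]:
    """Train once per source dataset, then score that model on every test set."""
    results: Dict[Tuple[str, str], float] = {}
    for train_name, (train_x, train_y) in train_sets.items():
        scorer = fit(train_x, train_y)
        for test_name, (test_x, test_y) in test_sets.items():
            scores = [scorer(x) for x in test_x]
            results[(train_name, test_name)] = roc_auc(test_y, scores)
    return results


def fit_sign_scorer(xs: List[float], ys: List[int]) -> Scorer:
    """Hypothetical toy 'training': learn only whether the feature rises or falls with the label."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    direction = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: direction * x
```

Entries of the result where the training and test names match give in-domain performance on the held-out split; the remaining entries give cross-domain performance, and the gap between the two is one simple way to quantify the effect of domain shift.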
Taxonomy
Topics: COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment
