Learning to Unlearn: Building Immunity to Dataset Bias in Medical Imaging Studies
Ahmed Ashraf, Shehroz Khan, Nikhil Bhagwat, Mallar Chakravarty, Babak Taati

TL;DR
This paper introduces a framework to reduce dataset bias in medical imaging models by unlearning study-specific features, aiming to improve cross-study generalization and focus on fundamental disease characteristics.
Contribution
The paper proposes a method that unlearns dataset membership, so that a model trained on one medical imaging study can generalize to other studies without re-training.
Findings
Empirical evidence of dataset bias in medical imaging datasets.
The proposed unlearning framework improves cross-study generalization.
Models trained with this method focus on disease-relevant features rather than dataset-specific quirks.
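One common way to probe for dataset bias is to check whether a classifier can tell which study a sample came from. The sketch below illustrates that idea on synthetic data. It is not the authors' experiment: the summary does not say how they measured bias, and the helper names (`make_study`, `membership_accuracy`), the five-feature layout, and the study-specific offset are all assumptions made for illustration. Held-out accuracy well above 50% suggests the features carry study-specific information.

```python
import numpy as np

def make_study(n, shift, rng):
    """Hypothetical 'study': n samples, 5 features, with a study-specific
    offset added to feature 0 (an assumed stand-in for scanner/site effects)."""
    x = rng.normal(size=(n, 5))
    x[:, 0] += shift
    return x

def membership_accuracy(xa, xb, rng, epochs=200, lr=0.1):
    """Fit logistic regression to predict study membership (A=0, B=1)
    and return accuracy on a held-out half of the pooled data."""
    x = np.vstack([xa, xb])
    y = np.concatenate([np.zeros(len(xa)), np.ones(len(xb))])
    idx = rng.permutation(len(x))
    x, y = x[idx], y[idx]
    split = len(x) // 2
    x_tr, y_tr, x_te, y_te = x[:split], y[:split], x[split:], y[split:]

    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x_tr @ w + b)))  # predicted P(study B)
        grad = p - y_tr                             # gradient of the log loss
        w -= lr * (x_tr.T @ grad) / len(y_tr)
        b -= lr * grad.mean()

    pred = (x_te @ w + b) > 0
    return float((pred == y_te).mean())

rng = np.random.default_rng(0)
# Two studies that differ by a study-specific offset: membership is predictable.
biased = membership_accuracy(make_study(500, 0.0, rng), make_study(500, 2.0, rng), rng)
# Two studies drawn from the same distribution: membership should be near chance.
unbiased = membership_accuracy(make_study(500, 0.0, rng), make_study(500, 0.0, rng), rng)
print(f"membership accuracy with offset: {biased:.2f}, without: {unbiased:.2f}")
```

In this framing, an unlearning method would aim to push the first number toward chance while keeping disease prediction accurate. How the paper achieves that is not specified in this summary.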
Abstract
Medical imaging machine learning algorithms are usually evaluated on a single dataset. Although training and testing are performed on different subsets of the dataset, models built on one study show limited capability to generalize to other studies. While database bias has been recognized as a serious problem in the computer vision community, it has remained largely unnoticed in medical imaging research. Transfer learning thus remains confined to the re-use of feature representations requiring re-training on the new dataset. As a result, machine learning models do not generalize even when trained on imaging datasets that were captured to study the same variable of interest. The ability to transfer knowledge gleaned from one study to another, without the need for re-training, if possible, would provide reassurance that the models are learning knowledge fundamental to the problem under…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Machine Learning and Data Classification
