Algorithmic encoding of protected characteristics in image-based models for disease detection
Ben Glocker, Charles Jones, Melanie Bernhardt, Stefan Winzeck

TL;DR
This study investigates whether and how protected characteristics such as race and sex are encoded and used by models that detect disease from chest X-ray images. It confirms subgroup performance disparities and proposes methods to analyze and mitigate them.
Contribution
The paper introduces a new methodology that combines test-set resampling, multitask learning, and model inspection to analyze how protected characteristics are encoded in image-based disease detection models.
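As a rough illustration of the test-set resampling idea, the sketch below draws a new test set with replacement so that every subgroup has the same size and the same disease prevalence. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `resample_test_set`, the record keys `"group"` and `"label"`, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def resample_test_set(records, n_per_group, prevalence, seed=0):
    """Resample with replacement so each subgroup has n_per_group records
    and the same fraction of positive labels (hypothetical helper)."""
    rng = random.Random(seed)

    # Bucket records by (subgroup, label) so each cell can be sampled separately.
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["group"], rec["label"])].append(rec)

    n_pos = round(n_per_group * prevalence)
    n_neg = n_per_group - n_pos

    resampled = []
    for group in sorted({rec["group"] for rec in records}):
        for label, k in ((1, n_pos), (0, n_neg)):
            pool = cells[(group, label)]
            if k > 0 and not pool:
                raise ValueError(f"no records for group={group!r}, label={label}")
            # Sampling with replacement lets small subgroups reach the target size.
            resampled.extend(rng.choices(pool, k=k))
    return resampled


# Toy data (assumed): group A has 8 records (4 positive), group B has 4 (1 positive).
toy = (
    [{"group": "A", "label": 1}] * 4
    + [{"group": "A", "label": 0}] * 4
    + [{"group": "B", "label": 1}] * 1
    + [{"group": "B", "label": 0}] * 3
)
balanced = resample_test_set(toy, n_per_group=10, prevalence=0.3)
positives_per_group = {
    g: sum(r["label"] for r in balanced if r["group"] == g) for g in ("A", "B")
}
```

After resampling, both subgroups contribute the same number of records and the same number of positives, so remaining performance gaps cannot be attributed to differences in subgroup size or prevalence in the test set.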
Findings
Subgroup disparities in true and false positive rates are confirmed.
Biases are partially mitigated by correcting for population and prevalence shifts.
Transfer learning alone is insufficient to determine if protected characteristics influence predictions.
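The subgroup disparities mentioned above are differences in true positive rate (TPR) and false positive rate (FPR) between groups at a shared decision threshold. The following is a minimal sketch of that comparison, assuming binary labels, one score per image, and a group tag per image. The function name `subgroup_rates`, the threshold of 0.5, and the toy values are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def subgroup_rates(labels, scores, groups, threshold=0.5):
    """Return per-subgroup TPR and FPR at one shared threshold (hypothetical helper)."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for y, s, g in zip(labels, scores, groups):
        predicted_positive = s >= threshold
        if y == 1:
            counts[g]["tp" if predicted_positive else "fn"] += 1
        else:
            counts[g]["fp" if predicted_positive else "tn"] += 1

    rates = {}
    for g, c in counts.items():
        n_pos = c["tp"] + c["fn"]
        n_neg = c["fp"] + c["tn"]
        rates[g] = {
            # A rate is undefined (NaN) when the subgroup has no cases of that class.
            "tpr": c["tp"] / n_pos if n_pos else float("nan"),
            "fpr": c["fp"] / n_neg if n_neg else float("nan"),
        }
    return rates


# Toy example (assumed values): four images per subgroup.
labels = [1, 1, 0, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.6, 0.7, 0.3, 0.1, 0.4]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = subgroup_rates(labels, scores, groups)
# rates["A"] == {"tpr": 1.0, "fpr": 0.5}; rates["B"] == {"tpr": 0.5, "fpr": 0.0}
```

In this toy case the same threshold gives group A a higher TPR and a higher FPR than group B, which is the kind of gap a subgroup analysis is meant to surface.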
Abstract
It has been rightfully emphasized that the use of AI for clinical decision making could amplify health disparities. An algorithm may encode protected characteristics, and then use this information for making predictions due to undesirable correlations in the (historical) training data. It remains unclear how we can establish whether such information is actually used. Besides the scarcity of data from underserved populations, very little is known about how dataset biases manifest in predictive models and how this may result in disparate performance. This article aims to shed some light on these issues by exploring new methodology for subgroup analysis in image-based disease detection models. We utilize two publicly available chest X-ray datasets, CheXpert and MIMIC-CXR, to study performance disparities across race and biological sex in deep learning models. We explore test set…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Colorectal Cancer Screening and Detection · Machine Learning in Healthcare
Methods: Test
