An AI-Guided Data Centric Strategy to Detect and Mitigate Biases in Healthcare Datasets
Faris F. Gulamali, Ashwin S. Sawant, Lora Liharska, Carol R. Horowitz, Lili Chan, Patricia H. Kovatch, Ira Hofer, Karandeep Singh, Lynne D. Richardson, Emmanuel Mensah, Alexander W Charney, David L. Reich, Jianying Hu, Girish N. Nadkarni

TL;DR
This paper introduces AEquity, a data-centric, model-agnostic metric for detecting and mitigating racial bias in healthcare datasets, demonstrated through case studies in medical imaging and healthcare utilization prediction.
Contribution
The paper presents AEquity, a novel metric for assessing dataset bias, and demonstrates its effectiveness in identifying and reducing racial bias in healthcare data.
Findings
AEquity effectively detects racial bias in healthcare datasets.
Applying AEquity-guided interventions reduces bias in case studies.
The approach is model- and task-agnostic, broadening its applicability.
Abstract
The adoption of diagnostic and prognostic algorithms in healthcare has led to concerns about the perpetuation of bias against disadvantaged groups of individuals. Deep learning methods to detect and mitigate bias have revolved around modifying models, optimization strategies, and threshold calibration, with varying levels of success. Here, we develop a data-centric, model-agnostic, task-agnostic approach to evaluate dataset bias by investigating the relationship between how easily different groups are learned at small sample sizes (AEquity). We then apply a systematic analysis of AEq values across subpopulations to identify and mitigate manifestations of racial bias in two known cases in healthcare - chest X-ray diagnosis with deep convolutional neural networks and healthcare utilization prediction with multivariate logistic regression. AEq is a novel and broadly applicable metric that…
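The abstract's core idea, comparing how easily different groups are learned at small sample sizes, can be illustrated with a per-group learning curve. The sketch below is not the AEquity metric, whose definition is not given in this summary. It is a minimal illustration under stated assumptions: synthetic one-dimensional data, a hand-written logistic regression, and held-out log loss as the "ease of learning" measure. All names (`make_group`, `fit_logistic`, `learning_curve`, `group_a`, `group_b`) and all parameter values are hypothetical.

```python
import math
import random

def make_group(n, separation, rng):
    """Synthetic 1-D group (assumption): label y in {0, 1},
    feature x ~ N(separation * (y - 0.5), 1)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(separation * (y - 0.5), 1.0)
        data.append((x, y))
    return data

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(data, steps=200, lr=0.5):
    """Plain gradient descent on mean log loss for a 1-D logistic model."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def log_loss(model, data, eps=1e-12):
    """Mean log loss of a fitted (w, b) model on held-out data."""
    w, b = model
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def learning_curve(separation, sizes, seed=0, test_n=500):
    """Held-out log loss after training on n samples, for each n in sizes."""
    rng = random.Random(seed)
    test = make_group(test_n, separation, rng)
    curve = []
    for n in sizes:
        train = make_group(n, separation, rng)
        curve.append(log_loss(fit_logistic(train), test))
    return curve

# Two hypothetical groups: group_a has well-separated classes,
# group_b has heavily overlapping classes and is harder to learn.
sizes = [8, 16, 32, 64]
curves = {
    "group_a": learning_curve(3.0, sizes, seed=1),
    "group_b": learning_curve(0.5, sizes, seed=2),
}
for name, curve in curves.items():
    print(name, [round(v, 3) for v in curve])
```

A persistent gap between the two curves at small sample sizes is the kind of signal a data-centric analysis would flag for follow-up, for example by collecting more data for the harder-to-learn group or by revisiting how its labels were defined.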
Taxonomy
Topics: Healthcare cost, quality, practices · Global Cancer Incidence and Screening · Machine Learning in Healthcare
