Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations
Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi

TL;DR
This study shows that machine learning models can infer patient race from clinical notes even after explicit indicators are removed, something human experts cannot do, and that this can lead to biased healthcare recommendations.
Contribution
The paper demonstrates that clinical notes carry implicit racial information and shows how that information can bias ML-driven healthcare decisions, highlighting a hidden source of bias.
Findings
Models can detect race from redacted clinical notes.
Humans cannot accurately predict race from the same notes.
Biases persist in model recommendations despite race redaction.
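The redaction step behind these findings, stripping explicit race indicators from a note before it reaches a model, can be illustrated with a minimal sketch. The term list, function names, and example note below are hypothetical and chosen only for illustration; they are not the authors' vocabulary or pipeline.

```python
import re

# Hypothetical term list for illustration only; NOT the paper's redaction vocabulary.
EXPLICIT_RACE_TERMS = ["african american", "black", "white", "caucasian"]


def build_pattern(terms):
    """Compile a case-insensitive, whole-word pattern over the given terms."""
    # Longest terms first so multi-word phrases are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def redact(note, pattern, placeholder="[REDACTED]"):
    """Replace every explicit-term match in the note with a placeholder."""
    return pattern.sub(placeholder, note)


PATTERN = build_pattern(EXPLICIT_RACE_TERMS)

# Made-up example note, not taken from any dataset.
example = "Pt is a 54 yo African American male with chest pain."
redacted = redact(example, PATTERN)
print(redacted)
```

A keyword filter like this removes only explicit mentions, and it is naive: it would also redact clinical phrases such as "white blood cell". The paper's point is that even after explicit indicators are stripped, the remaining text can still carry implicit signals that a model can pick up.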
Abstract
Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on…
Taxonomy
Topics: Machine Learning in Healthcare · Colorectal Cancer Screening and Detection · Global Cancer Incidence and Screening
