Evaluation of Bias Towards Medical Professionals in Large Language Models
Xi Chen, Yang Xu, MingKe You, Li Wang, WeiZhi Liu, Jian Li

TL;DR
This study assesses biases in large language models regarding gender and race when evaluating medical professional resumes, revealing significant biases that could impact healthcare workforce diversity.
Contribution
It provides a comprehensive evaluation of bias in three major LLMs using a large dataset of simulated resumes, highlighting potential risks in healthcare applications.
Findings
All LLMs exhibited significant gender and racial biases.
Biases varied by medical specialty and model, favoring certain demographics.
LLMs' preferences often did not align with real-world demographics.
Abstract
This study evaluates whether large language models (LLMs) exhibit biases towards medical professionals. Fictitious candidate resumes were created to control for identity factors while maintaining consistent qualifications. Three LLMs (GPT-4, Claude-3-haiku, and Mistral-Large) were tested using a standardized prompt to evaluate resumes for specific residency programs. Explicit bias was tested by changing gender and race information, while implicit bias was tested by changing names while hiding race and gender. Physician data from the Association of American Medical Colleges was used to compare with real-world demographics. 900,000 resumes were evaluated. All LLMs exhibited significant gender and racial biases across medical specialties. Gender preferences varied, favoring male candidates in surgery and orthopedics, while preferring females in dermatology, family medicine, obstetrics and…
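The resume construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the field names, the fixed qualifications, the category labels, and the prompt wording are all hypothetical placeholders, since the study's actual template and prompt are not shown here.

```python
from itertools import product

# Hypothetical fixed qualifications, shared by every variant so that
# only the identity information differs between resumes.
BASE_RESUME = {
    "education": "MD, accredited medical school",
    "research": "2 publications",
}


def explicit_variants(base, genders, races):
    """Explicit condition: one resume per (gender, race) pair."""
    return [{**base, "gender": g, "race": r} for g, r in product(genders, races)]


def implicit_variants(base, names):
    """Implicit condition: one resume per name, with no gender or race field."""
    return [{**base, "name": n} for n in names]


def render_prompt(resume, specialty):
    """Hypothetical standardized prompt; the same wording is used for every resume."""
    fields = "\n".join(f"{key}: {value}" for key, value in resume.items())
    return f"Evaluate this candidate for a {specialty} residency program.\n{fields}"


# Placeholder category labels and names, for illustration only.
explicit = explicit_variants(BASE_RESUME, ["male", "female"], ["group_1", "group_2", "group_3"])
implicit = implicit_variants(BASE_RESUME, ["NAME_1", "NAME_2"])
```

Because every variant reuses the same base fields, any difference in how a model ranks the rendered prompts can only come from the identity information that was changed.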
Taxonomy
Topics: Interpreting and Communication in Healthcare
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention
