Bias in Large Language Models Across Clinical Applications: A Systematic Review
Thanathip Suenghataiphorn, Narisara Tribuddharat, Pojsakorn Danpanichkul, Narathorn Kulthamrongsri

TL;DR
This systematic review highlights the widespread presence of bias in large language models used in healthcare, emphasizing the need for rigorous evaluation and mitigation strategies to ensure equitable patient care.
Contribution
It provides a comprehensive analysis of bias sources, manifestations, and impacts in clinical LLMs, and underscores the importance of ongoing monitoring and mitigation.
Findings
Bias is pervasive across clinical LLM applications.
Data and model-related biases significantly contribute to disparities.
Biases affect attributes like race, gender, and age, impacting clinical outcomes.
Abstract
Background: Large language models (LLMs) are rapidly being integrated into healthcare, promising to enhance various clinical tasks. However, concerns exist regarding their potential for bias, which could compromise patient care and exacerbate health inequities. This systematic review investigates the prevalence, sources, manifestations, and clinical implications of bias in LLMs.
Methods: We conducted a systematic search of PubMed, OVID, and EMBASE from database inception through 2025 for studies evaluating bias in LLMs applied to clinical tasks. We extracted data on LLM type, bias source, bias manifestation, affected attributes, clinical task, evaluation methods, and outcomes. Risk of bias was assessed using a modified ROBINS-I tool.
Results: Thirty-eight studies met inclusion criteria, revealing pervasive bias across various LLMs and clinical applications. Both data-related bias (from…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Genomics and Rare Diseases
