Who Gets Cited? Gender- and Majority-Bias in LLM-Driven Reference Selection
Jiangen He

TL;DR
This paper investigates gender bias in LLM-driven reference selection, revealing persistent male and majority-group biases that vary across fields and are only modestly mitigated by prompt strategies, risking reinforcement of gender disparities.
Contribution
It systematically analyzes gender bias in LLMs' citation recommendations, shows how that bias depends on candidate pool size and field, and evaluates how well prompt-based mitigation strategies reduce it.
Findings
LLMs prefer male-authored references
Bias increases with larger candidate pools
Bias varies across scientific disciplines
Abstract
Large language models (LLMs) are rapidly being adopted as research assistants, particularly for literature review and reference recommendation, yet little is known about whether they introduce demographic bias into citation workflows. This study systematically investigates gender bias in LLM-driven reference selection using controlled experiments with pseudonymous author names. We evaluate several LLMs (GPT-4o, GPT-4o-mini, Claude Sonnet, and Claude Haiku) by varying gender composition within candidate reference pools and analyzing selection patterns across fields. Our results reveal two forms of bias: a persistent preference for male-authored references and a majority-group bias that favors whichever gender is more prevalent in the candidate pool. These biases are amplified in larger candidate pools and only modestly attenuated by prompt-based mitigation strategies. Field-level…
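To make the two bias notions in the abstract concrete, here is a minimal Python sketch of one way to compare a gender's share among the selected references with its share in the candidate pool. It is not the paper's metric or code. The function names `gender_share` and `selection_gap`, the `"M"`/`"F"` labels, and the toy pool are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def gender_share(labels, gender="M"):
    """Fraction of items labelled `gender`; 0.0 for an empty list."""
    if not labels:
        return 0.0
    return Counter(labels)[gender] / len(labels)


def selection_gap(pool, selected_idx, gender="M"):
    """Share of `gender` among selected references minus its share in the pool.

    A positive value means `gender` is over-represented among the picks
    relative to the candidate pool (an assumed, illustrative measure).
    """
    selected = [pool[i] for i in selected_idx]
    return gender_share(selected, gender) - gender_share(pool, gender)


# Toy, made-up pool: 6 male-labelled and 4 female-labelled candidates.
pool = ["M"] * 6 + ["F"] * 4
# Hypothetical picks returned by a model: 4 male-labelled, 1 female-labelled.
selected_idx = [0, 1, 2, 3, 6]
gap = selection_gap(pool, selected_idx)
```

A single positive gap in a male-majority pool cannot separate a male preference from a majority-group preference. Under this sketch, the two would be distinguished by computing the same gap on pools where the female-labelled group is the majority, which mirrors the abstract's description of varying gender composition within candidate pools.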
