Quantifying Gender Bias in Large Language Models: When ChatGPT Becomes a Hiring Manager
Nina Gerszberg, Janka Hamori, Andrew Lo

TL;DR
This paper investigates gender bias in large language models like ChatGPT within hiring scenarios, revealing biases in candidate evaluation and exploring prompt engineering for mitigation.
Contribution
It quantifies gender bias in LLMs during hiring tasks and evaluates prompt engineering as a method to reduce such biases.
Findings
LLMs tend to favor female candidates in hiring decisions.
LLMs perceive female candidates as more qualified, yet recommend lower pay for them than for male candidates.
Prompt engineering can help mitigate gender bias in LLM outputs.
Abstract
The growing prominence of large language models (LLMs) in daily life has heightened concerns that LLMs exhibit many of the same gender-related biases as their creators. In the context of hiring decisions, we quantify the degree to which LLMs perpetuate societal biases and investigate prompt engineering as a bias mitigation technique. Our findings suggest that for a given résumé, an LLM is more likely to hire a female candidate and perceive them as more qualified, but still recommends lower pay relative to male candidates.
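One way to make "quantify the degree" concrete is a paired audit: the same résumé is scored once per gender signal, and the per-group averages are compared. The sketch below is illustrative only and is not taken from the paper. The `Evaluation` record, its field names, and the `gender_gaps` helper are all hypothetical, and the code assumes the model's hire decision, qualification score, and recommended pay have already been parsed into numbers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Evaluation:
    """One parsed model response for one résumé variant (hypothetical schema)."""
    resume_id: str        # identifies the underlying résumé shared by both variants
    gender: str           # "female" or "male", e.g. signalled by the candidate name
    hired: bool           # did the model recommend hiring?
    qualification: float  # model's qualification score, on any fixed scale
    pay: float            # model's recommended pay, in any fixed unit


def group_summary(evals, gender):
    """Average the three outcomes over all evaluations for one gender."""
    group = [e for e in evals if e.gender == gender]
    if not group:
        raise ValueError(f"no evaluations for gender {gender!r}")
    return {
        "hire_rate": mean(1.0 if e.hired else 0.0 for e in group),
        "qualification": mean(e.qualification for e in group),
        "pay": mean(e.pay for e in group),
    }


def gender_gaps(evals):
    """Return female-minus-male differences for each outcome.

    A positive value means the model's average was higher for the
    female variants; a negative value means it was higher for the male ones.
    """
    female = group_summary(evals, "female")
    male = group_summary(evals, "male")
    return {key: female[key] - male[key] for key in female}
```

Because every résumé appears under both gender signals, any nonzero gap reflects the signal rather than the résumé content. A real audit would also need repeated samples per prompt and a significance test, which this sketch omits.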
