Quantifying Gender Bias in Large Language Models: When ChatGPT Becomes a Hiring Manager

Nina Gerszberg; Janka Hamori; Andrew Lo

arXiv:2604.00011 · cs.CY · April 2, 2026

TL;DR

This paper investigates gender bias in large language models such as ChatGPT in hiring scenarios. It finds bias in how candidates are evaluated and explores prompt engineering as a mitigation technique.

Contribution

The paper quantifies gender bias in LLMs on hiring tasks and evaluates prompt engineering as a method for reducing that bias.

Findings

01. For a given résumé, LLMs are more likely to hire a female candidate.

02. LLMs rate female candidates as more qualified but recommend lower pay for them than for male candidates.

03. Prompt engineering can help mitigate gender bias in LLM outputs.

Abstract

The growing prominence of large language models (LLMs) in daily life has heightened concerns that LLMs exhibit many of the same gender-related biases as their creators. In the context of hiring decisions, we quantify the degree to which LLMs perpetuate societal biases and investigate prompt engineering as a bias mitigation technique. Our findings suggest that for a given résumé, an LLM is more likely to hire a female candidate and perceive them as more qualified, but still recommends lower pay relative to male candidates.
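
One way to picture the "for a given résumé" comparison is as a paired design: the same résumé is scored once under a male-coded name and once under a female-coded name, and the outcomes are differenced. The sketch below is an illustration only and is not taken from the paper. The record layout, the field names (`hired`, `score`, `pay`), the function `paired_gaps`, and the toy values are all hypothetical, and the paper's actual metrics and prompts may differ.

```python
from statistics import mean

# Hypothetical toy records, made up purely for illustration (not the paper's data).
# Each record holds the model's outputs for ONE résumé under two name variants.
TOY_RECORDS = [
    {
        "resume_id": 1,
        "male": {"hired": False, "score": 6.0, "pay": 80000},
        "female": {"hired": True, "score": 7.0, "pay": 78000},
    },
    {
        "resume_id": 2,
        "male": {"hired": True, "score": 8.0, "pay": 90000},
        "female": {"hired": True, "score": 8.0, "pay": 89000},
    },
]


def paired_gaps(records):
    """Return mean female-minus-male gaps across paired résumé variants.

    A positive gap means the female-coded variant received the higher value.
    """
    hire_gap = mean(
        int(r["female"]["hired"]) - int(r["male"]["hired"]) for r in records
    )
    score_gap = mean(r["female"]["score"] - r["male"]["score"] for r in records)
    pay_gap = mean(r["female"]["pay"] - r["male"]["pay"] for r in records)
    return {"hire_rate_gap": hire_gap, "score_gap": score_gap, "pay_gap": pay_gap}


if __name__ == "__main__":
    for name, value in paired_gaps(TOY_RECORDS).items():
        print(f"{name}: {value}")
```

Because each résumé acts as its own control, differences in résumé quality cancel out of the gaps. The pattern the abstract describes would correspond to positive hire-rate and score gaps together with a negative pay gap. A real analysis would also need many repeated samples per résumé and a significance test, both omitted here.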
