Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?

Haozhe An; Christabel Acquaye; Colin Wang; Zongxia Li; Rachel Rudinger

arXiv:2406.10486·cs.CL·June 18, 2024

TL;DR

This study investigates whether large language models exhibit bias based on race, ethnicity, and gender in simulated hiring decisions. In many settings the models favor White applicants over Hispanic applicants, and the pattern is sensitive to prompt variations.

Contribution

The paper introduces a novel templatic prompting method to measure race and gender bias in LLMs' hiring decisions, highlighting bias patterns and their prompt sensitivity.

Findings

01 In many settings, LLMs are more likely to accept White applicants than Hispanic applicants

02 Acceptance rates by group vary across prompt templates

03 Biases are idiosyncratic and prompt-sensitive

Abstract

We examine whether large language models (LLMs) exhibit race- and gender-based name discrimination in hiring decisions, similar to classic findings in the social sciences (Bertrand and Mullainathan, 2004). We design a series of templatic prompts to LLMs to write an email to a named job applicant informing them of a hiring decision. By manipulating the applicant's first name, we measure the effect of perceived race, ethnicity, and gender on the probability that the LLM generates an acceptance or rejection email. We find that the hiring decisions of LLMs in many settings are more likely to favor White applicants over Hispanic applicants. In aggregate, the groups with the highest and lowest acceptance rates respectively are masculine White names and masculine Hispanic names. However, the comparative acceptance rates by group vary under different templatic settings, suggesting that LLMs'…
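The measurement described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the `{name}` and `{role}` template slots, the placeholder group labels and names, and the caller-supplied `decide` function (standing in for "query the LLM, then label the generated email as acceptance or rejection") are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def build_prompts(
    templates: List[str],
    names_by_group: Dict[str, List[str]],
    role: str,
) -> List[Tuple[str, str, str]]:
    """Fill every template with every name; return (group, name, prompt) triples."""
    prompts = []
    for template in templates:
        for group, names in names_by_group.items():
            for name in names:
                prompts.append((group, name, template.format(name=name, role=role)))
    return prompts


def acceptance_rates(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of accepted outcomes per group."""
    totals: Dict[str, int] = defaultdict(int)
    accepted: Dict[str, int] = defaultdict(int)
    for group, was_accepted in outcomes:
        totals[group] += 1
        accepted[group] += int(was_accepted)
    return {group: accepted[group] / totals[group] for group in totals}


def run_audit(
    templates: List[str],
    names_by_group: Dict[str, List[str]],
    role: str,
    decide: Callable[[str], bool],
) -> Dict[str, float]:
    """Run `decide` on every prompt and aggregate acceptance rates by group.

    `decide` is a hypothetical hook: in a real study it would call a model
    and classify the generated email as an acceptance (True) or rejection (False).
    """
    outcomes = [
        (group, decide(prompt))
        for group, _name, prompt in build_prompts(templates, names_by_group, role)
    ]
    return acceptance_rates(outcomes)


if __name__ == "__main__":
    # Placeholder templates and names for illustration only; not the paper's materials.
    demo_templates = ["Write an email informing {name} of the decision for the {role} role."]
    demo_names = {"group_a": ["NameA1", "NameA2"], "group_b": ["NameB1", "NameB2"]}
    # Toy stand-in for a model: deliberately biased so the rates differ.
    print(run_audit(demo_templates, demo_names, "analyst", lambda p: "NameA" in p))
```

Because only the first name changes between otherwise identical prompts, any gap in the per-group rates is attributable to the name. Running the same audit separately for each template is one way to see the prompt sensitivity the abstract reports.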

Taxonomy

Topics: Computational and Text Analysis Methods