Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation
Petter Törnberg

TL;DR
This study shows that large language models systematically reproduce racial stereotypes when used for text annotation, embedding societal biases into automated processes in settings ranging from academic research to content moderation and hiring.
Contribution
The paper demonstrates that LLMs encode racial stereotypes in annotation outcomes, highlighting biases in name-based and dialect-based assessments across multiple models and tasks.
Findings
Names associated with Black individuals are rated as more aggressive and gossipy.
Asian names are rated as more intelligent but less confident and sociable.
African American Vernacular English increases perceived toxicity and anger in text.
Abstract
Large language models (LLMs) are increasingly used for automated text annotation in tasks ranging from academic research to content moderation and hiring. Across 19 LLMs and two experiments totaling more than 4 million annotation judgments, we show that subtle identity cues embedded in text systematically bias annotation outcomes in ways that mirror racial stereotypes. In a names-based experiment spanning 39 annotation tasks, texts containing names associated with Black individuals are rated as more aggressive by 18 of 19 models and more gossipy by 18 of 19. Asian names produce a bamboo-ceiling profile: 17 of 19 models rate individuals as more intelligent, while 18 of 19 rate them as less confident and less sociable. Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are consistently rated as less self-disciplined. In a matched…
Taxonomy
Topics: Authorship Attribution and Profiling · Computational and Text Analysis Methods · Names, Identity, and Discrimination Research
