Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans
Messi H.J. Lee, Jacob M. Montgomery, Calvin K. Lai

TL;DR
This study finds that ChatGPT portrays socially subordinate groups as more homogeneous than dominant groups, mirroring a bias documented in human social perception. Such portrayals could reinforce stereotypes.
Contribution
The paper identifies and analyzes a new form of bias in LLMs that goes beyond stereotypical attribute associations: the model depicts minority groups as more homogeneous than majority groups.
Findings
ChatGPT portrays African, Asian, and Hispanic Americans as more homogeneous than White Americans.
Gender differences in perceived homogeneity are small but present.
The bias varies across racial/ethnic groups and genders.
Abstract
Large language models (LLMs) are becoming pervasive in everyday life, yet their propensity to reproduce biases inherited from training data remains a pressing concern. Prior investigations into bias in LLMs have focused on the association of social groups with stereotypical attributes. However, this is only one form of human bias such systems may reproduce. We investigate a new form of bias in LLMs that resembles a social psychological phenomenon where socially subordinate groups are perceived as more homogeneous than socially dominant groups. We had ChatGPT, a state-of-the-art LLM, generate texts about intersectional group identities and compared those texts on measures of homogeneity. We consistently found that ChatGPT portrayed African, Asian, and Hispanic Americans as more homogeneous than White Americans, indicating that the model described racial minority groups with a narrower…
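The abstract, as excerpted here, does not say which homogeneity measures were used. As a rough illustration of what comparing generated texts "on measures of homogeneity" could look like, the sketch below scores a set of texts by their mean pairwise cosine similarity. The bag-of-words vectors and the function names (`bow`, `cosine`, `homogeneity`) are assumptions made for this example, not the authors' pipeline; a real analysis would more likely use sentence embeddings.

```python
import math
from collections import Counter
from itertools import combinations


def bow(text):
    """Lowercased whitespace-token counts (a toy stand-in for an embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def homogeneity(texts):
    """Mean cosine similarity over all unordered pairs of texts.

    Higher values mean the texts resemble one another more,
    i.e. the group is described in a narrower way.
    """
    vectors = [bow(t) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two texts")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

Under this hypothetical measure, one would compute `homogeneity(...)` separately on the texts generated for each group and compare the scores; a higher score for one group would indicate that the model's texts about that group are more alike.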
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Hate Speech and Cyberbullying Detection
