Large language models that replace human participants can harmfully misportray and flatten identity groups
Angelina Wang, Jamie Morgenstern, John P. Dickerson

TL;DR
This paper critically examines the limitations of large language models in accurately representing social identities, highlighting potential harms when used as replacements for human participants in social science research.
Contribution
The paper argues analytically and demonstrates empirically that current LLMs tend to misportray and flatten demographic identities, raising concerns about their use in socially sensitive applications.
Findings
LLMs often misportray and flatten demographic groups
Empirical evidence comes from human studies with 3200 participants across 16 demographic identities
Inference techniques can reduce, but not eliminate, harms
Abstract
Large language models (LLMs) are increasing in capability and popularity, propelling their application in new domains -- including as replacements for human participants in computational social science, user testing, annotation tasks, and more. In many settings, researchers seek to distribute their surveys to a sample of participants that are representative of the underlying human population of interest. This means in order to be a suitable replacement, LLMs will need to be able to capture the influence of positionality (i.e., relevance of social identities like gender and race). However, we show that there are two inherent limitations in the way current LLMs are trained that prevent this. We argue analytically for why LLMs are likely to both misportray and flatten the representations of demographic groups, then empirically show this on 4 LLMs through a series of human studies with 3200…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
