ChatGPT is not A Man but Das Man: Representativeness and Structural Consistency of Silicon Samples Generated by Large Language Models

Dai Li; Linzhuo Li; Huilian Sophie Qiu

arXiv:2507.02919 · cs.CL · July 8, 2025

TL;DR

This paper critically examines the use of large language models such as ChatGPT as substitutes for human opinion surveys. It finds structural inconsistencies and homogenization in their responses that undermine their representativeness and validity.

Contribution

The paper identifies two key challenges in using LLMs for opinion sampling, structural inconsistency and homogenization, and proposes an accuracy-optimization hypothesis to explain the homogenization.

Findings

01. LLM responses show significant structural inconsistencies relative to human data: response accuracy does not hold across demographic aggregation levels.

02. LLM responses are homogenized, underrepresenting minority opinions.

03. These issues call into question the validity of using LLMs as direct substitutes for surveys.

Abstract

Large language models (LLMs) in the form of chatbots like ChatGPT and Llama are increasingly proposed as "silicon samples" for simulating human opinions. This study examines this notion, arguing that LLMs may misrepresent population-level opinions. We identify two fundamental challenges: a failure in structural consistency, where response accuracy doesn't hold across demographic aggregation levels, and homogenization, an underrepresentation of minority opinions. To investigate these, we prompted ChatGPT (GPT-4) and Meta's Llama 3.1 series (8B, 70B, 405B) with questions on abortion and unauthorized immigration from the American National Election Studies (ANES) 2020. Our findings reveal significant structural inconsistencies and severe homogenization in LLM responses compared to human data. We propose an "accuracy-optimization hypothesis," suggesting homogenization stems from prioritizing…
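
To make the two challenges concrete, here is a minimal Python sketch of how they could be checked on survey-style answers. It is an illustration under stated assumptions, not the authors' procedure. The function names, the use of total variation distance and Shannon entropy as measures, the group labels, and the toy answers are all hypothetical and invented for this example. None of them come from ANES 2020 or from any model output.

```python
from collections import Counter
from math import log2


def distribution(responses):
    """Share of each answer option in a list of responses."""
    counts = Counter(responses)
    return {option: n / len(responses) for option, n in counts.items()}


def entropy(responses):
    """Shannon entropy in bits; lower values mean more homogeneous answers."""
    return sum(-p * log2(p) for p in distribution(responses).values())


def total_variation(a, b):
    """Total variation distance between two answer distributions (0 = identical)."""
    pa, pb = distribution(a), distribution(b)
    options = set(pa) | set(pb)
    return 0.5 * sum(abs(pa.get(k, 0.0) - pb.get(k, 0.0)) for k in options)


def pooled(groups):
    """Flatten per-group answers into one population-level list."""
    return [r for responses in groups.values() for r in responses]


# Toy data invented for illustration only; NOT survey or model results.
human = {
    "group_a": ["yes"] * 6 + ["no"] * 4,
    "group_b": ["yes"] * 4 + ["no"] * 6,
}
silicon = {
    "group_a": ["yes"] * 10,  # every simulated respondent gives the majority answer
    "group_b": ["no"] * 10,
}

# Structural consistency: compare the gap at two aggregation levels.
pooled_gap = total_variation(pooled(human), pooled(silicon))
group_gaps = {g: total_variation(human[g], silicon[g]) for g in human}

# Homogenization: how much within-group answer diversity is lost.
entropy_drop = {g: entropy(human[g]) - entropy(silicon[g]) for g in human}

print("pooled gap:", pooled_gap)
print("per-group gaps:", group_gaps)
print("entropy lost per group (bits):", entropy_drop)
```

In this toy case the pooled distributions match exactly while each subgroup is off by a distance of about 0.4, so a good population-level fit hides subgroup-level error. Each simulated subgroup also has zero entropy, meaning the minority answer within the group has disappeared entirely.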
