Fairness in LLM-Generated Surveys
Andrés Abeliuk, Vanessa Gaete, Naim Bro

TL;DR
This paper investigates biases in Large Language Models (LLMs) used to simulate survey responses across socio-demographic and geographic groups. It reveals performance disparities rooted in training data biases and proposes a framework for fairness assessment.
Contribution
It introduces a novel framework for measuring socio-demographic biases in LLMs and highlights the importance of cross-cultural fairness in survey applications.
Findings
LLMs predict survey responses more accurately on U.S. datasets than on Chilean ones, a gap attributed to U.S.-centric training data.
Political identity and race affect prediction accuracy in the U.S.
Gender, education, and religion influence performance in Chile.
Abstract
Large Language Models (LLMs) excel in text generation and understanding, especially in simulating socio-political and economic patterns, serving as an alternative to traditional surveys. However, their global applicability remains questionable due to unexplored biases across socio-demographic and geographic contexts. This study examines how LLMs perform across diverse populations by analyzing public surveys from Chile and the United States, focusing on predictive accuracy and fairness metrics. The results show performance disparities, with LLMs consistently performing better on U.S. datasets. This bias originates from the U.S.-centric training data and remains evident after accounting for socio-demographic differences. In the U.S., political identity and race significantly influence prediction accuracy, while in Chile, gender, education, and religious affiliation play more pronounced roles.…
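To make the idea of comparing predictive accuracy across groups concrete, the following is a minimal sketch, not the authors' framework. It assumes each record holds an LLM-predicted answer and the respondent's actual answer; the field names (`country`, `predicted`, `actual`), both helper functions, and the toy records are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group value: fraction of records where predicted == actual}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        correct[group] += int(rec["predicted"] == rec["actual"])
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical toy records for illustration only, not data from the paper.
records = [
    {"country": "US", "predicted": "A", "actual": "A"},
    {"country": "US", "predicted": "B", "actual": "B"},
    {"country": "US", "predicted": "A", "actual": "B"},
    {"country": "US", "predicted": "A", "actual": "A"},
    {"country": "CL", "predicted": "A", "actual": "B"},
    {"country": "CL", "predicted": "B", "actual": "B"},
    {"country": "CL", "predicted": "A", "actual": "A"},
    {"country": "CL", "predicted": "B", "actual": "A"},
]

per_country = accuracy_by_group(records, "country")
gap = accuracy_gap(per_country)
print(per_country, gap)
```

The same helper could be called with any other grouping field present in the records (for example a gender or education column) to compare accuracy within a single country. A simple max-minus-min gap is only one of several possible disparity measures and is used here for brevity.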
Taxonomy
Topics: Qualitative Comparative Analysis Research
