Political Bias in LLMs: Unaligned Moral Values in Agent-centric Simulations
Simon Münker

TL;DR
This paper investigates how generative language models, when prompted with political personas, reflect or misrepresent human moral and political biases, revealing inconsistencies and weak alignment with real-world data.
Contribution
It introduces a method to adapt open-source language models to simulate political personas and analyzes their alignment with human moral responses, highlighting limitations in ideological representation.
Findings
Models show high response variance across repetitions.
Synthetic data correlates only weakly with real human data.
Conservative personas especially fail to align with actual conservative populations.
Abstract
Contemporary research in social sciences increasingly utilizes state-of-the-art generative language models to annotate or generate content. While these models achieve benchmark-leading performance on common language tasks, their application to novel out-of-domain tasks remains insufficiently explored. To address this gap, we investigate how personalized language models align with human responses on the Moral Foundation Theory Questionnaire. We adapt open-source generative language models to different political personas and repeatedly survey these models to generate synthetic data sets where model-persona combinations define our sub-populations. Our analysis reveals that models produce inconsistent results across multiple repetitions, yielding high response variance. Furthermore, the alignment between synthetic data and corresponding human data from psychological studies shows a weak…
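The two quantities the abstract refers to, response variance across repeated surveys and the correlation between synthetic and human responses, can be illustrated with a minimal sketch. This is not the paper's code: the function names, the 0 to 5 answer scale, and all numbers below are hypothetical placeholders chosen only to show the computation, and Pearson correlation is assumed here as the alignment measure.

```python
from statistics import mean, pvariance


def item_means(repetitions):
    """Mean score per questionnaire item across repeated surveys.

    `repetitions` is a list of runs; each run is a list of item scores.
    """
    return [mean(column) for column in zip(*repetitions)]


def item_variances(repetitions):
    """Population variance per item across repeated surveys."""
    return [pvariance(column) for column in zip(*repetitions)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical toy data: three repetitions of one model-persona
    # combination answering four items on an assumed 0-5 scale.
    synthetic_runs = [
        [4, 2, 5, 1],
        [0, 3, 5, 2],
        [5, 1, 4, 3],
    ]
    # Hypothetical per-item means for the matching human sub-population.
    human_item_means = [3.1, 2.4, 4.0, 1.9]

    print("variance per item:", item_variances(synthetic_runs))
    print("correlation with human means:",
          pearson(item_means(synthetic_runs), human_item_means))
```

High per-item variance would indicate that the same model-persona combination answers inconsistently across repetitions, while a correlation near zero would indicate weak alignment with the human sub-population.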
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
Methods: ALIGN
