Conformity in Large Language Models
Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos

TL;DR
This study investigates the conformity bias in large language models, revealing their tendency to align with majority responses, especially under uncertainty, and explores methods to reduce this bias for more reliable AI systems.
Contribution
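The paper adapts psychological conformity experiments to LLMs, is the first to show that models conform more when they are uncertain in their own predictions, examines how factors such as training paradigms and input influence this bias, and proposes interventions to mitigate it.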
Findings
All tested models conform to majority responses to varying degrees, regardless of their initial choice or its correctness.
Models are more likely to conform when uncertain about their predictions.
Instruction-tuned models are less susceptible to conformity.
Abstract
The conformity effect describes the tendency of individuals to align their responses with the majority. Studying this bias in large language models (LLMs) is crucial, as LLMs are increasingly used in various information-seeking and decision-making tasks as conversation partners to improve productivity. Thus, conformity to incorrect responses can compromise their effectiveness. In this paper, we adapt psychological experiments to examine the extent of conformity in popular LLMs. Our findings reveal that all tested models exhibit varying levels of conformity toward the majority, regardless of their initial choice or correctness, across different knowledge domains. Notably, we are the first to show that LLMs are more likely to conform when they are more uncertain in their own prediction. We further explore factors that influence conformity, such as training paradigms and input…
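As a rough illustration of how conformity of this kind could be quantified, the sketch below computes the share of trials in which a model switches to the majority answer, then splits that rate by the model's confidence in its initial answer. This is not the paper's protocol: the `Trial` record, the rule of counting only trials where the initial answer disagrees with the majority, the 0.5 confidence threshold, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """Hypothetical record of one conformity trial (not the paper's format)."""
    initial: str       # model's answer when asked alone
    majority: str      # answer given by the simulated majority
    final: str         # model's answer after seeing the majority
    confidence: float  # model's probability for its initial answer, in [0, 1]


def conformity_rate(trials: list[Trial]) -> Optional[float]:
    """Share of trials where the model switched to the majority answer.

    Assumption: only trials where the initial answer disagreed with the
    majority are counted, since agreement leaves nothing to conform to.
    Returns None when no trial qualifies.
    """
    eligible = [t for t in trials if t.initial != t.majority]
    if not eligible:
        return None
    switched = sum(1 for t in eligible if t.final == t.majority)
    return switched / len(eligible)


def rate_by_confidence(trials: list[Trial], threshold: float = 0.5) -> dict:
    """Conformity rate for low- and high-confidence trials separately."""
    low = [t for t in trials if t.confidence < threshold]
    high = [t for t in trials if t.confidence >= threshold]
    return {"low": conformity_rate(low), "high": conformity_rate(high)}


# Made-up toy data, for illustration only; these are not results from the paper.
toy = [
    Trial("A", "B", "B", 0.3),  # uncertain, switches to the majority
    Trial("A", "B", "A", 0.9),  # confident, keeps its answer
    Trial("C", "B", "B", 0.4),  # uncertain, switches to the majority
    Trial("B", "B", "B", 0.8),  # already agrees, so it is excluded
]

overall = conformity_rate(toy)
by_confidence = rate_by_confidence(toy)
```

Splitting by confidence is one simple way to probe whether uncertainty and conformity move together; a real analysis would need many trials per bucket and a principled uncertainty estimate.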
Taxonomy
Topics: Topic Modeling
Methods: ALIGN
