Many LLMs Are More Utilitarian Than One
Anita Keshmirian, Razan Baltaji, Babak Hemmatian, Hadi Asghari, Lav R. Varshney

TL;DR
This study investigates how large language models (LLMs) judge moral dilemmas when acting alone versus in groups. It finds a Utilitarian Boost in group settings that arises through different mechanisms than the one observed in humans.
Contribution
The paper demonstrates that multi-agent LLM systems exhibit a Utilitarian Boost in moral judgments, shows that its mechanisms differ from those in humans, and explores factors that modulate it.
Findings
In personal dilemmas, LLMs rate moral violations as more acceptable when deliberating in groups than when reasoning alone.
The Utilitarian Boost in LLMs differs mechanistically from the one seen in humans.
Model differences influence the strength and occurrence of the Boost.
Abstract
Moral judgment is integral to large language models' (LLMs) social reasoning. As multi-agent systems gain prominence, it becomes crucial to understand how LLMs function when collaborating compared to operating as individual agents. In human moral judgment, group deliberation leads to a Utilitarian Boost: a tendency to endorse norm violations that inflict harm but maximize benefits for the greatest number of people. We study whether a similar dynamic emerges in multi-agent LLM systems. We test six models on well-established sets of moral dilemmas across two conditions: (1) Solo, where models reason independently, and (2) Group, where they engage in multi-turn discussions in pairs or triads. In personal dilemmas, where agents decide whether to directly harm an individual for the benefit of others, all models rated moral violations as more acceptable when part of a group, demonstrating a…
Taxonomy
Topics: Psychology of Moral and Emotional Judgment · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
