The Pluralistic Moral Gap: Understanding Judgment and Value Differences between Humans and Large Language Models
Giuseppe Russo, Debora Nozza, Paul Röttger, Dirk Hovy

TL;DR
This paper investigates the alignment between large language models and human moral judgments, revealing a gap in value diversity and proposing a method to improve moral alignment and diversity in LLM outputs.
Contribution
The paper introduces the Moral Dilemma Dataset, analyzes the gap between LLM and human moral judgments, and proposes Dynamic Moral Profiling (DMP) to improve LLM moral alignment and value diversity.
Findings
Models align with human judgments only under high consensus
LLMs rely on fewer moral values than humans
DMP improves alignment by 64.3% and increases value diversity
Abstract
People increasingly rely on Large Language Models (LLMs) for moral advice, which may influence humans' decisions. Yet, little is known about how closely LLMs align with human moral judgments. To address this, we introduce the Moral Dilemma Dataset, a benchmark of 1,618 real-world moral dilemmas paired with a distribution of human moral judgments consisting of a binary evaluation and a free-text rationale. We treat this problem as a pluralistic distributional alignment task, comparing the distributions of LLM and human judgments across dilemmas. We find that models reproduce human judgments only under high consensus; alignment deteriorates sharply when human disagreement increases. In parallel, using a 60-value taxonomy built from 3,783 value expressions extracted from rationales, we show that LLMs rely on a narrower set of moral values than humans. These findings reveal a pluralistic…
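To make the distributional framing concrete, here is a minimal sketch of how one might compare human and model judgments on a single dilemma. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not name its alignment or diversity metrics, so the absolute difference in acceptance rates, the majority-share consensus score, and Shannon entropy used here are stand-ins, and all function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def positive_rate(judgments):
    """Fraction of binary judgments (1 = acceptable, 0 = not) equal to 1."""
    return sum(judgments) / len(judgments)


def judgment_gap(human, model):
    """Absolute difference between human and model positive rates.

    Assumed stand-in metric: 0 means identical distributions, 1 means opposite.
    """
    return abs(positive_rate(human) - positive_rate(model))


def consensus(judgments):
    """Share of judgments that agree with the majority label (0.5 to 1.0)."""
    p = positive_rate(judgments)
    return max(p, 1 - p)


def value_entropy(values):
    """Shannon entropy in bits of a list of value labels.

    Assumed stand-in for value diversity: higher means more varied values.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy data, invented for illustration only.
human = [1, 1, 0, 0, 1, 0, 1, 0]  # evenly split: low human consensus
model = [1, 1, 1, 1, 1, 1, 1, 0]  # model judgments lean heavily one way

broad_values = ["fairness", "honesty", "loyalty", "care"]
narrow_values = ["fairness", "fairness", "fairness", "care"]

print("gap:", judgment_gap(human, model))
print("human consensus:", consensus(human))
print("broad entropy:", value_entropy(broad_values))
print("narrow entropy:", value_entropy(narrow_values))
```

In this toy case the human judgments are evenly split while the model's are nearly unanimous, so the gap is large, and the narrower value list has lower entropy than the broader one. Both patterns mirror the kind of difference the abstract describes, though the numbers themselves carry no empirical meaning.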
Taxonomy
Topics: Computational and Text Analysis Methods · Psychology of Moral and Emotional Judgment
