Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs
Jen-tse Huang, Jiantong Qin, Xueli Qiu, Sharon Levy, Michelle R. Kaufman, Mark Dredze

TL;DR
This study investigates how large language models understand and act on human values. It finds near-perfect cross-model consistency in scenario-based decisions, but only weak correspondence between the values models report and the values they enact, a knowledge-action gap that humans show as well.
Contribution
The paper introduces ValAct-15k, a new dataset for evaluating LLMs' value understanding and action, and provides empirical insights into their alignment and knowledge-action gap.
Findings
LLMs show near-perfect cross-model consistency in scenario-based decisions
Humans show broad variability on the same scenario-based decisions
Both humans and LLMs show weak correspondence between self-reported and enacted values
Abstract
Value alignment is central to the development of safe and socially compatible artificial intelligence. However, how Large Language Models (LLMs) represent and enact human values in real-world decision contexts remains under-explored. We present ValAct-15k, a dataset of 3,000 advice-seeking scenarios derived from Reddit, designed to elicit ten values defined by Schwartz Theory of Basic Human Values. Using both the scenario-based questions and the traditional value questionnaire, we evaluate ten frontier LLMs (five from U.S. companies, five from Chinese ones) and human participants (). We find near-perfect cross-model consistency in scenario-based decisions (Pearson ), contrasting sharply with the broad variability observed among humans (). Yet, both humans and LLMs show weak correspondence between self-reported and enacted values ($r = 0.4,…
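The cross-model consistency described in the abstract is reported as a Pearson correlation. The sketch below shows one plausible way such a number could be computed: average the Pearson r over every pair of models' value profiles. It is an illustration only, not the authors' pipeline. The function names, the model labels, and the toy scores are all hypothetical, and the profiles use five values rather than ten purely for brevity.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_pearson(profiles):
    """Average Pearson r over all unordered pairs of value profiles."""
    pairs = list(combinations(profiles.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical per-value scores (assumption: each entry is the share of
# scenarios in which a model chose the option expressing that value).
toy_profiles = {
    "model_a": [0.90, 0.20, 0.70, 0.40, 0.10],
    "model_b": [0.80, 0.30, 0.60, 0.50, 0.20],
    "model_c": [0.85, 0.25, 0.75, 0.35, 0.15],
}

consistency = mean_pairwise_pearson(toy_profiles)
print(f"mean pairwise Pearson r: {consistency:.3f}")
```

The same `pearson` helper could be applied to one participant's questionnaire profile and scenario-derived profile to gauge the knowledge-action gap, again under the assumption that both are expressed as per-value scores.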
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Computational and Text Analysis Methods
