Assessing LLMs for Moral Value Pluralism
Noam Benkler, Drisana Mosaphir, Scott Friedman, Andrew Smart, Sonja Schmer-Galunder

TL;DR
This paper develops a method using NLP to assess and quantify the implicit moral and cultural values in LLM outputs, revealing biases and misalignments with diverse demographics based on social science surveys.
Contribution
It introduces the Recognizing Value Resonance (RVR) model to evaluate implicit moral values in LLMs, enabling quantitative analysis of value alignment across cultures and demographics.
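To make "quantitative analysis of value alignment" concrete, here is a minimal sketch of how per-statement resonance labels could be compared with survey responses. It is not the paper's implementation: the label names (`resonates`, `neutral`, `conflicts`), their numeric scores, the function names, the gap metric, and all data values are assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical resonance labels that an RVR-style classifier might assign to a
# (generated text, value statement) pair. The paper's label set may differ.
RESONANCE_SCORE = {"resonates": 1.0, "neutral": 0.0, "conflicts": -1.0}


def llm_value_profile(labels_by_statement):
    """Average the resonance labels of several LLM outputs per value statement."""
    return {
        statement: mean(RESONANCE_SCORE[label] for label in labels)
        for statement, labels in labels_by_statement.items()
    }


def alignment_gap(llm_profile, survey_profile):
    """Mean absolute difference over shared statements (0 means identical)."""
    shared = sorted(set(llm_profile) & set(survey_profile))
    if not shared:
        raise ValueError("no shared value statements")
    return mean(abs(llm_profile[s] - survey_profile[s]) for s in shared)


# Made-up illustrative data, not results from the paper.
labels = {
    "Tradition is important": ["resonates", "neutral", "conflicts", "conflicts"],
    "Men and women should have equal rights": ["resonates", "resonates", "resonates", "neutral"],
}
# Hypothetical survey means for one demographic, rescaled to [-1, 1].
survey = {
    "Tradition is important": 0.5,
    "Men and women should have equal rights": 0.75,
}

profile = llm_value_profile(labels)
gap = alignment_gap(profile, survey)
```

A larger `gap` for one demographic than for another would suggest that the model's outputs sit further from that group's stated values, which is the kind of comparison the findings below summarize.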
Findings
LLMs show Western-centric value biases.
They overestimate conservatism in non-Western countries.
They misrepresent gender and age-related values.
Abstract
The field of AI currently lacks methods to quantitatively assess and potentially alter the moral values inherent in the output of large language models (LLMs). However, decades of social science research has developed and refined widely accepted moral value surveys, such as the World Values Survey (WVS), eliciting value judgments from direct questions in various geographies. We have turned those questions into value statements and use NLP to compute how well popular LLMs are aligned with moral values for various demographics and cultures. While the WVS is accepted as an explicit assessment of values, we lack methods for assessing implicit moral and cultural values in media, e.g., encountered in social media, political rhetoric, narratives, and generated by AI systems such as LLMs that are increasingly present in our daily lives. As we consume online content and utilize LLM outputs, we…
Taxonomy
Topics: Cultural Differences and Values · Social and Intergroup Psychology
