Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches
Pablo Biedma, Xiaoyuan Yi, Linus Huang, Maosong Sun, Xing Xie

TL;DR
This paper introduces ValueLex, a framework that reconstructs the value system of large language models from scratch using psychological methods, and finds a structured set of core value dimensions that differs from human-oriented value systems.
Contribution
The work proposes a method to reconstruct LLMs' values from scratch: value descriptors are elicited generatively from the models, then organized through factor analysis and semantic clustering into a structured value system with core dimensions distinct from human values.
Findings
Identified three core value dimensions: Competence, Character, and Integrity.
Developed tailored projective tests for evaluating LLMs' value inclinations.
Revealed that LLMs possess a structured, non-human value system.
Abstract
Recent advances in Large Language Models (LLMs) have revolutionized the AI field but also pose potential safety and ethical risks. Deciphering the values embedded in LLMs is therefore crucial for assessing and mitigating those risks. Despite extensive investigation into LLMs' values, previous studies rely heavily on human-oriented value systems from the social sciences. A natural question then arises: do LLMs possess unique values beyond those of humans? To investigate, this work proposes a novel framework, ValueLex, to reconstruct LLMs' unique value system from scratch, leveraging psychological methodologies from human personality and value research. Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs, synthesizing a taxonomy that culminates in a comprehensive value framework via factor analysis and semantic clustering. We identify three core…
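The factor-analysis step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses principal-component factor extraction on a correlation matrix as a simple stand-in for factor analysis, omits the rotation step (e.g. varimax) that exploratory analyses usually add, and runs on synthetic data. The function names, the ten hypothetical "value-word" items, and their grouping into blocks of 5, 3, and 2 are all assumptions made for illustration.

```python
import numpy as np


def extract_factors(responses, n_factors):
    """Principal-component factor extraction from a (samples x items) matrix.

    A simple stand-in for factor analysis; no rotation is applied.
    """
    # Correlation matrix between items (columns).
    corr = np.corrcoef(responses, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Keep the n_factors largest eigenvalues, in descending order.
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    return loadings, eigvals[order]


def assign_items(loadings):
    """Assign each item to the factor on which its absolute loading is largest."""
    return np.argmax(np.abs(loadings), axis=1)


# Synthetic data (hypothetical): three latent dimensions drive ten items.
rng = np.random.default_rng(0)
n_samples = 500
latent = rng.normal(size=(n_samples, 3))
# Item-to-dimension map: items 0-4, 5-7, and 8-9 share a latent dimension.
membership = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2])
responses = latent[:, membership] + 0.3 * rng.normal(size=(n_samples, 10))

loadings, eigvals = extract_factors(responses, n_factors=3)
groups = assign_items(loadings)
print(groups)
```

The blocks are deliberately unequal in size so that the leading eigenvalues are well separated. With equal-sized blocks the unrotated factors would be free to mix, which is one reason a rotation step is normally used in practice.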
Taxonomy
Topics: Natural Language Processing Techniques
