High-Dimension Human Value Representation in Large Language Models

Samuel Cahyawijaya; Delong Chen; Yejin Bang; Leila Khalatbari; Bryan Wilie; Ziwei Ji; Etsuko Ishii; Pascale Fung

arXiv:2404.07900·cs.CL·March 27, 2025·1 cites

TL;DR

This paper introduces UniVaR, a high-dimensional, scalable neural representation of human values in large language models, enabling visualization and understanding of value prioritization across diverse languages and cultures.

Contribution

We propose UniVaR, a novel continuous, self-supervised high-dimensional representation of human values in LLMs, independent of architecture and training data.

Findings

01. UniVaR effectively visualizes value prioritization in 25 languages.

02. It reveals complex interactions between human values and language modeling.

03. The approach is evaluated on 15 open-source and commercial LLMs.

Abstract

The widespread application of LLMs across various tasks and fields has necessitated the alignment of these models with human values and preferences. Given the various approaches to human value alignment, there is an urgent need to understand the scope and nature of human values injected into these LLMs before their deployment and adoption. We propose UniVaR, a high-dimensional neural representation of symbolic human value distributions in LLMs, orthogonal to model architecture and training data. This is a continuous and scalable representation, self-supervised from the value-relevant output of 8 LLMs and evaluated on 15 open-source and commercial LLMs. Through UniVaR, we visualize and explore how LLMs prioritize different values in 25 languages and cultures, shedding light on the complex interplay between human values and language modeling.
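
To make the idea of comparing value representations across models and languages concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that some encoder has already mapped each value-relevant LLM response to a fixed-length vector, and the function names, the `(model, language)` labels, and the 3-dimensional toy vectors are all hypothetical. It averages the embeddings per group and compares groups by cosine similarity.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_centroids(records):
    """Average embeddings per group label.

    records: iterable of (label, embedding) pairs, where label is any
    hashable key, here a hypothetical (model, language) tuple.
    """
    grouped = defaultdict(list)
    for label, vector in records:
        grouped[label].append(vector)
    return {label: centroid(vectors) for label, vectors in grouped.items()}


# Toy, made-up embeddings (assumption: real ones would come from a trained
# value encoder and have many more dimensions).
records = [
    (("model_a", "en"), [1.0, 0.0, 0.0]),
    (("model_a", "en"), [0.8, 0.2, 0.0]),
    (("model_a", "zh"), [0.0, 1.0, 0.0]),
    (("model_b", "en"), [0.9, 0.1, 0.0]),
]

centroids = group_centroids(records)
sim_same_lang = cosine(centroids[("model_a", "en")], centroids[("model_b", "en")])
sim_cross_lang = cosine(centroids[("model_a", "en")], centroids[("model_a", "zh")])
print(f"same language, different model: {sim_same_lang:.3f}")
print(f"same model, different language: {sim_cross_lang:.3f}")
```

In this toy data the two English groups have a higher similarity than the English and Chinese groups of the same model, but that pattern is built into the made-up vectors and says nothing about the paper's results. A 2-D projection of such centroids would be one way to produce the kind of visualization the abstract describes.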

Code & Models

Repositories

hltchkust/univar · PyTorch · Official

Videos

High-Dimension Human Value Representation in Large Language Models · Underline

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods