Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences

Joshua Ashkinaze; Hua Shen; Saipranav Avula; Eric Gilbert; Ceren Budak

arXiv:2511.02109 · cs.AI · January 13, 2026

Open Access

TL;DR

The paper introduces the Deep Value Benchmark (DVB), an evaluation framework that tests whether large language models learn underlying human values or only surface-level preferences. The models evaluated show limited deep value generalization.

Contribution

The paper contributes the DVB framework, whose experimental design controls the confounding between deep values and shallow features to measure whether models generalize the former rather than the latter.

Findings

01

Average Deep Value Generalization Rate (DVGR) across models is 0.30, below chance.

02

Larger models tend to have slightly lower DVGR.

03

Models generally struggle to generalize deep values reliably.

Abstract

We introduce the Deep Value Benchmark (DVB), an evaluation framework that directly tests whether large language models (LLMs) learn fundamental human values or merely surface-level preferences. This distinction is critical for AI alignment: Systems that capture deeper values are likely to generalize human intentions robustly, while those that capture only superficial patterns in preference data risk producing misaligned behavior. The DVB uses a novel experimental design with controlled confounding between deep values (e.g., moral principles) and shallow features (e.g., superficial attributes). In the training phase, we expose LLMs to human preference data with deliberately correlated deep and shallow features -- for instance, where a user consistently prefers (non-maleficence, formal language) options over (justice, informal language) alternatives. The testing phase then breaks these…
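The confounded design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Option` class and the `make_test_pair` and `dvgr` helpers are hypothetical names introduced here. The sketch assumes that the test phase swaps the shallow features between the two options, and that DVGR is the fraction of test choices that follow the deep value. The choice counts are toy data for illustration, not results from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One response option, described by a deep value and a shallow feature."""
    deep_value: str        # e.g. "non-maleficence"
    shallow_feature: str   # e.g. "formal language"


def make_test_pair(preferred: Option, rejected: Option) -> tuple[Option, Option]:
    """Swap shallow features so the deep and shallow cues point to different options.

    Assumption: this is one way to break the training-phase correlation.
    """
    deep_consistent = Option(preferred.deep_value, rejected.shallow_feature)
    shallow_consistent = Option(rejected.deep_value, preferred.shallow_feature)
    return deep_consistent, shallow_consistent


def dvgr(choices: list[Option], preferred_value: str) -> float:
    """Fraction of test-phase choices that carry the deep value preferred in training."""
    if not choices:
        raise ValueError("need at least one test choice")
    followed_deep = sum(c.deep_value == preferred_value for c in choices)
    return followed_deep / len(choices)


# Training phase: the user consistently prefers the first option over the second,
# so deep value and shallow feature are perfectly correlated.
preferred = Option("non-maleficence", "formal language")
rejected = Option("justice", "informal language")

# Test phase (assumed): the same deep values with the shallow features swapped.
deep_opt, shallow_opt = make_test_pair(preferred, rejected)

# Toy model behaviour over 10 test trials: 3 follow the deep value, 7 the shallow cue.
choices = [deep_opt] * 3 + [shallow_opt] * 7
rate = dvgr(choices, preferred.deep_value)
print(rate)  # 0.3
```

Under these assumptions, a rate of 0.5 would correspond to choosing between the two test options at random, so a rate below 0.5 means the shallow feature is driving more choices than the deep value.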

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Mobile Crowdsensing and Crowdsourcing