Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

Haonan Dong; Qiguan Feng; Kehan Jiang; Haoran Ye; Xin Zhang; Guojie Song

arXiv:2605.10365·cs.AI·May 12, 2026

TL;DR

This paper introduces Agent-ValueBench, a benchmark for evaluating the values of autonomous agents across 16 domains. It fills a gap left by existing value benchmarks, which cover only LLMs.

Contribution

It presents the first dedicated benchmark for agent values, featuring diverse environments, tasks, and expert-curated data, enabling systematic evaluation of agent value alignment.

Findings

01. Agent values show a cross-model homogeneity called the Value Tide.

02. The Value Tide is influenced by harness pull and deliberate steering.

03. Agent alignment is shifting from model prompt steering to harness and skill steering.

Abstract

Autonomous agents have rapidly matured as task executors and seen widespread deployment via harnesses such as OpenClaw. Safety concerns have rightly drawn growing research attention, and beneath them lie the values silently steering agent behavior. Existing value benchmarks, however, remain confined to LLMs, leaving agent values largely uncharted. From intuitive, empirical, and theoretical vantage points, we show that an agent's values diverge from those of its underlying LLM, and the agentic modality further introduces dataset-, evaluation-, and system-level challenges absent from text-only protocols. We close this gap with Agent-ValueBench, the first benchmark dedicated to agent values. It features 394 executable environments across 16 domains, offering 4,335 value-conflict tasks that cover 28 value systems and 332 dimensions. Every instance is co-synthesized through our purpose-built…

Code & Models

Repositories

valuebyte-ai/Agent-ValueBench
github

Datasets

Value4AI/Agent-ValueBench
dataset · 2.6k downloads
