Towards Cross-lingual Values Judgment: A Consensus-Pluralism Perspective

Yukun Chen; Xinyu Zhang; Boyi Deng; Jialong Tang; Yu Wan; Fei Huang; Yuxi Zhou; Baosong Yang; Yiming Li

arXiv:2602.17283 · cs.CL · May 12, 2026

TL;DR

This paper introduces X-Value, a comprehensive benchmark for evaluating large language models' ability to judge deep-level values across multiple languages, addressing a critical gap in multilingual AI evaluation.

Contribution

It presents a novel two-stage human-AI annotation framework and the first cross-lingual values judgment benchmark, X-Value, covering 14 languages and 7 global issue categories.

Findings

01 LLMs show limitations in cross-lingual values judgment accuracy.

02 Performance varies significantly across languages and issue categories.

03 Current models need improvement in values-aware content evaluation.
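
The per-language and per-category variation noted above is the kind of result a grouped accuracy breakdown makes visible. The following is a minimal sketch, not the paper's evaluation code: the record fields (`language`, `category`, `gold`, `predicted`), the label strings, and the toy data are all hypothetical and are not taken from X-Value.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Group judgment records by `key` and return the accuracy of each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[key]
        total[group] += 1
        # A judgment counts as correct when the model label matches the gold label.
        if record["predicted"] == record["gold"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records: field names, labels, and values are illustrative only.
records = [
    {"language": "en", "category": "category_a", "gold": "acceptable", "predicted": "acceptable"},
    {"language": "en", "category": "category_b", "gold": "unacceptable", "predicted": "unacceptable"},
    {"language": "zh", "category": "category_a", "gold": "acceptable", "predicted": "unacceptable"},
    {"language": "zh", "category": "category_b", "gold": "unacceptable", "predicted": "unacceptable"},
]

by_language = accuracy_by(records, "language")
by_category = accuracy_by(records, "category")
print(by_language)
print(by_category)
```

Reporting both breakdowns side by side separates a model that is weak in one language from a model that is weak on one issue category, which a single pooled accuracy figure would hide.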

Abstract

As large language models (LLMs) are employed worldwide, existing evaluation paradigms for their multilingual capabilities primarily focus on factual task performance, neglecting the ability to judge content's deep-level values across multiple languages. To bridge this gap, we first reveal two primary challenges in constructing values judgment benchmarks, cultural diversity and disciplinary complexity, and propose a novel two-stage human-AI collaborative annotation framework to alleviate them. This framework identifies the issue scope and nature, establishes specific annotation criteria, and utilizes multiple LLMs for final review. Building upon this framework, we introduce X-Value, the first Cross-lingual Values Judgment Benchmark designed to evaluate the capability of LLMs in judging deep-level values of content. X-Value comprises 4,750 Question-Answer pairs across 14…

Code & Models

Repositories

https://huggingface.co/datasets/Whitolf/X-Value