AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

Jing Yao; Shitong Duan; Xiaoyuan Yi; Dongkuan Xu; Peng Zhang; Tun Lu; Ning Gu; Zhicheng Dou; Xing Xie

arXiv:2505.13531·cs.CY·March 9, 2026

Open Access · 3 Reviews

TL;DR

AdAEM is a novel, adaptive evaluation method that automatically generates diverse test questions to better distinguish LLMs' value differences, addressing limitations of static benchmarks and enabling dynamic tracking of model alignment.

Contribution

We introduce AdAEM, a self-extensible, adaptive evaluation algorithm that probes LLMs' value boundaries to generate informative, diverse test questions for better comparison.

Findings

01. AdAEM effectively distinguishes LLMs based on their value inclinations.

02. The method maximizes information gain to produce diverse, controversial topics.

03. AdAEM tracks value dynamics of LLMs over development stages.

Abstract

Assessing Large Language Models' (LLMs) underlying value differences enables comprehensive comparison of their misalignment, cultural adaptability, and biases. Nevertheless, current value measurement methods face the informativeness challenge: with often outdated, contaminated, or generic test questions, they can only capture the orientations on common safety values, e.g., HHH, shared among different LLMs, leading to indistinguishable and uninformative results. To address this problem, we introduce AdAEM, a novel, self-extensible evaluation algorithm for revealing LLMs' inclinations. Distinct from static benchmarks, AdAEM automatically and adaptively generates and extends its test questions. This is achieved by probing the internal value boundaries of a diverse set of LLMs developed across cultures and time periods in an in-context optimization manner. Such a process theoretically…
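The abstract is cut off before it states the objective, so the following is only a rough illustration of the general idea of preferring questions on which models disagree. It assumes each model's answer to a candidate question has already been mapped to a probability distribution over value dimensions, and it uses the generalized Jensen-Shannon divergence as a stand-in disagreement score. The function names, the example questions, and the numbers are all hypothetical; none of this is taken from the paper.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_divergence(dists):
    """Generalized Jensen-Shannon divergence with equal weights.

    Computed as H(mixture) - mean(H(d)); it is near zero when all
    distributions agree and grows as they spread apart.
    """
    n = len(dists)
    mixture = [sum(col) / n for col in zip(*dists)]
    return entropy(mixture) - sum(entropy(d) for d in dists) / n


def rank_questions(candidates):
    """Sort questions from most to least divergent across models."""
    scored = [(js_divergence(d), q) for q, d in candidates.items()]
    return [q for _, q in sorted(scored, reverse=True)]


# Hypothetical per-model distributions over two value dimensions.
candidates = {
    "generic safety question": [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]],
    "contested policy question": [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]],
}
ranking = rank_questions(candidates)
```

In this toy setup the question on which the three hypothetical models answer identically scores close to zero, which mirrors the informativeness problem the abstract attributes to generic test questions.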

Peer Reviews

Decision·ICLR 2026 Oral

Reviewer 01 · Rating 8 · Confidence 3

Strengths

- The framework is novel, extensible, and generalizable
- The framework and the generated dataset of 12k questions should be useful for researchers exploring value differences in LLMs
- The writing is easy to follow
- There is substantial analysis aiming to validate the effectiveness of the benchmark
- Includes a discussion on how the authors think about values for LLMs

Weaknesses

- A lot of the analysis is conditioned on the Schwartz Value Survey. Though in principle the approach could work on other value frameworks, a proof of concept on a different set of questions would’ve strengthened the generalization capability of the framework.
- The validity analysis claims construct validity but assumes o3-mini’s capability of generating text with a particular value present, which is not a given.

Reviewer 02 · Rating 4 · Confidence 3

Strengths

Rigorous, scalable measurement of LLM value orientations is a timely, high-impact question for safety, alignment, and governance communities. This paper provides a dynamic, adaptive evaluation of LLM values, helping to overcome the limitations of traditional static benchmarks—namely their inability to capture new events, the difficulty of updating datasets dynamically, and the reliance on manual maintenance. The central claims and proposed contributions are supported by well-organized benchmark

Weaknesses

1. Methodologically, selecting questions that maximize divergence in LLM responses does not necessarily best reveal their value preferences; it may instead induce new biases. This could be a significant shortcoming of the work and requires careful justification.
2. The EM/IM-like alternating procedure lacks a formal convergence or monotonicity guarantee in the main text.
3. The heavy reliance on empirical approximations of the mathematical model weakens the method's rigor.
4. What motivated the choice of hyper-parameters?

Reviewer 03 · Rating 8 · Confidence 4

Strengths

The paper addresses a critical and well-defined problem in LLM evaluation. Moving from static to a dynamic, self-extensible benchmark (AdAEM) is a significant conceptual and practical contribution. The core idea of using in-context optimization to probe value boundaries across different cultures (diverse LLMs) and time periods (knowledge cutoffs) is clever and highly effective at generating informative questions. The authors provide a thorough empirical analysis. The results clearly show that

Weaknesses

The framework's current implementation relies solely on Schwartz's Theory of Basic Values. While well-justified, this is just one of many value frameworks, and the paper could benefit from a brief discussion on how AdAEM could be adapted to other theories (e.g., Moral Foundations Theory). The paper acknowledges the potential for misuse, but it is worth noting that a framework designed to find controversial topics could indeed be misused.

Taxonomy

Topics: Computational and Text Analysis Methods · Topic Modeling · Machine Learning in Materials Science

Methods: Sparse Evolutionary Training