CONDESION-BENCH: Conditional Decision-Making of Large Language Models in Compositional Action Space

Yeonjun Hwang; Sungyong Park; Minju Kim; Dongha Lee; Jinyoung Yeo

arXiv:2604.09029·cs.CL·April 13, 2026

TL;DR

CONDESION-BENCH is a new benchmark for evaluating large language models' ability to make decisions within complex, condition-restricted, compositional action spaces, addressing limitations of previous simplified benchmarks.

Contribution

The paper introduces a benchmark that incorporates explicit conditions and compositional actions, providing a more realistic evaluation of LLM decision-making capabilities than benchmarks built on pre-defined candidate actions.

Findings

01

Benchmark reveals LLMs' strengths and weaknesses in conditional decision-making.

02

Oracle-based evaluation offers rigorous assessment of decision quality and condition adherence.

03

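Addresses two limitations of previous decision-making benchmarks: actions restricted to a finite set of pre-defined candidates, and no explicit conditions on action feasibility.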

Abstract

Large language models have been widely explored as decision-support tools in high-stakes domains due to their contextual understanding and reasoning capabilities. However, existing decision-making benchmarks rely on two simplifying assumptions: actions are selected from a finite set of pre-defined candidates, and explicit conditions restricting action feasibility are not incorporated into the decision-making process. These assumptions fail to capture the compositional structure of real-world actions and the explicit conditions that constrain their validity. To address these limitations, we introduce CONDESION-BENCH, a benchmark designed to evaluate conditional decision-making in compositional action space. In CONDESION-BENCH, actions are defined as allocations to decision variables and are restricted by explicit conditions at the variable, contextual, and allocation levels. By employing…
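To make the abstract's setup concrete, here is a minimal sketch of an action expressed as an allocation to decision variables and checked against conditions at the variable, contextual, and allocation levels. It is not the benchmark's actual data format or evaluation code: the portfolio scenario, the variable names, the `Condition` class, and the `violated` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: an action is an allocation of values to decision variables.
Allocation = Dict[str, float]
Context = Dict[str, str]


@dataclass
class Condition:
    level: str        # "variable", "contextual", or "allocation"
    description: str
    check: Callable[[Allocation, Context], bool]


def violated(allocation: Allocation, context: Context,
             conditions: List[Condition]) -> List[Condition]:
    """Return the conditions that the allocation fails to satisfy."""
    return [c for c in conditions if not c.check(allocation, context)]


# Illustrative conditions (assumed, not taken from the paper).
conditions = [
    # Variable level: each variable must stay inside its own range.
    Condition("variable", "each share lies in [0, 1]",
              lambda a, ctx: all(0.0 <= v <= 1.0 for v in a.values())),
    # Contextual level: feasibility depends on the scenario description.
    Condition("contextual", "low risk tolerance caps stocks at 0.3",
              lambda a, ctx: ctx["risk_tolerance"] != "low" or a["stocks"] <= 0.3),
    # Allocation level: a constraint over the allocation as a whole.
    Condition("allocation", "shares sum to 1",
              lambda a, ctx: abs(sum(a.values()) - 1.0) < 1e-9),
]

context = {"risk_tolerance": "low"}
feasible = {"stocks": 0.2, "bonds": 0.5, "cash": 0.3}
infeasible = {"stocks": 0.6, "bonds": 0.5, "cash": 0.1}

print([c.level for c in violated(feasible, context, conditions)])
print([c.level for c in violated(infeasible, context, conditions)])
```

Under these assumptions, the first allocation satisfies every condition, while the second breaks the contextual cap on stocks and the allocation-level sum. Because each variable can take any value in a range, the action space is compositional rather than a short list of pre-defined candidates, which is the distinction the abstract draws.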
