CONDESION-BENCH: Conditional Decision-Making of Large Language Models in Compositional Action Space
Yeonjun Hwang, Sungyong Park, Minju Kim, Dongha Lee, Jinyoung Yeo

TL;DR
CONDESION-BENCH is a new benchmark for evaluating large language models' ability to make decisions within complex, condition-restricted, compositional action spaces, addressing limitations of previous simplified benchmarks.
Contribution
It introduces a novel benchmark that incorporates explicit conditions and compositional actions, providing a more realistic evaluation of LLM decision-making capabilities.
Findings
The benchmark reveals LLMs' strengths and weaknesses in conditional decision-making.
Oracle-based evaluation offers rigorous assessment of decision quality and condition adherence.
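The benchmark addresses two limitations of previous decision-making benchmarks: actions drawn from a finite set of pre-defined candidates, and no explicit conditions restricting action feasibility.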
Abstract
Large language models have been widely explored as decision-support tools in high-stakes domains due to their contextual understanding and reasoning capabilities. However, existing decision-making benchmarks rely on two simplifying assumptions: actions are selected from a finite set of pre-defined candidates, and explicit conditions restricting action feasibility are not incorporated into the decision-making process. These assumptions fail to capture the compositional structure of real-world actions and the explicit conditions that constrain their validity. To address these limitations, we introduce CONDESION-BENCH, a benchmark designed to evaluate conditional decision-making in compositional action space. In CONDESION-BENCH, actions are defined as allocations to decision variables and are restricted by explicit conditions at the variable, contextual, and allocation levels. By employing…
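To make the abstract's framing concrete, the following is a minimal sketch of an action expressed as an allocation to decision variables and checked against conditions at the variable, contextual, and allocation levels. It is not the benchmark's actual format or evaluation code: the portfolio scenario, the helper names (`variable_bound`, `contextual_cap`, `total_budget`, `violated`), and all numeric limits are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# An action is an allocation: a value assigned to each decision variable.
Allocation = Dict[str, float]
Context = Dict[str, object]
Condition = Callable[[Allocation, Context], bool]


def variable_bound(name: str, lo: float, hi: float) -> Condition:
    """Variable-level condition (hypothetical): one variable must stay in [lo, hi]."""
    return lambda alloc, ctx: lo <= alloc.get(name, 0.0) <= hi


def contextual_cap(name: str, flag: str, cap: float) -> Condition:
    """Contextual condition (hypothetical): the cap applies only when ctx[flag] is truthy."""
    return lambda alloc, ctx: (not ctx.get(flag)) or alloc.get(name, 0.0) <= cap


def total_budget(budget: float) -> Condition:
    """Allocation-level condition (hypothetical): all variables must sum to the budget."""
    return lambda alloc, ctx: abs(sum(alloc.values()) - budget) < 1e-9


def violated(alloc: Allocation, ctx: Context,
             conditions: Dict[str, Condition]) -> List[str]:
    """Return the labels of every condition the allocation fails to satisfy."""
    return [label for label, cond in conditions.items() if not cond(alloc, ctx)]


# Illustrative scenario: split 100 units across three asset classes.
conditions = {
    "variable": variable_bound("stocks", 0, 60),
    "contextual": contextual_cap("stocks", "near_retirement", 40),
    "allocation": total_budget(100),
}
context = {"near_retirement": True}

feasible = {"stocks": 40, "bonds": 50, "cash": 10}
infeasible = {"stocks": 70, "bonds": 20, "cash": 20}

print(violated(feasible, context, conditions))
print(violated(infeasible, context, conditions))
```

Because an action here is any assignment of values to the variables rather than a pick from a fixed candidate list, feasibility has to be checked against the conditions directly, which is the kind of setting the abstract describes.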
