ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization

Joseph Tso; Preston Schmittou; Quan Huynh; Jibran Hutchins

arXiv:2602.22465 · cs.AI · March 2, 2026

TL;DR

This paper introduces ConstraintBench, a benchmark for evaluating large language models' ability to directly solve constrained optimization problems across multiple domains, revealing current models' limitations in feasibility and optimality.

Contribution

ConstraintBench provides an evaluation framework for LLMs on direct constrained optimization across 10 operations research domains, with ground-truth solutions verified by the Gurobi solver, and highlights how hard it is for models to produce solutions that are both feasible and optimal.

Findings

01. Best model achieves 65.0% feasibility

02. Feasible solutions reach 89-96% of optimal objective

03. Models struggle with joint feasibility and optimality
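
To make the three findings concrete, here is a minimal sketch of how a feasibility rate, an objective ratio over feasible solutions, and a joint feasible-and-optimal rate could be computed from per-task results. The `TaskResult` record, the `summarize` helper, the tolerance, and the assumption of strictly positive objectives are all illustrative; this page does not give the paper's exact metric definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    """Hypothetical per-task record; field names are assumptions."""
    feasible: bool              # did the solution satisfy every constraint?
    objective: Optional[float]  # model's objective value (None if infeasible)
    optimum: float              # solver-proven optimal objective
    maximize: bool = True       # direction of the objective


def objective_ratio(r: TaskResult) -> float:
    """Fraction of the optimum reached, in (0, 1]; assumes positive objectives."""
    if r.maximize:
        return r.objective / r.optimum
    return r.optimum / r.objective


def summarize(results: list[TaskResult], tol: float = 1e-6) -> dict:
    """Aggregate feasibility, mean objective ratio, and joint success."""
    feasible = [r for r in results if r.feasible]
    ratios = [objective_ratio(r) for r in feasible]
    joint = sum(1 for x in ratios if x >= 1.0 - tol)
    return {
        "feasibility_rate": len(feasible) / len(results),
        # Averaged over feasible solutions only, as in finding 02.
        "mean_ratio_when_feasible": sum(ratios) / len(ratios) if ratios else None,
        # Feasible AND (near-)optimal, counted over all tasks.
        "joint_rate": joint / len(results),
    }


results = [
    TaskResult(True, 90.0, 100.0),
    TaskResult(True, 100.0, 100.0),
    TaskResult(False, None, 100.0),
    TaskResult(True, 50.0, 40.0, maximize=False),
]
summary = summarize(results)
```

Averaging the ratio only over feasible solutions is what lets a high objective ratio coexist with a much lower feasibility rate: the two numbers answer different questions.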

Abstract

Large language models are increasingly applied to operational decision-making where the underlying structure is constrained optimization. Existing benchmarks evaluate whether LLMs can formulate optimization problems as solver code, but leave open a complementary question. Can LLMs directly produce correct solutions to fully specified constrained optimization problems without access to a solver? We introduce ConstraintBench, a benchmark for evaluating LLMs on direct constrained optimization across 10 operations research domains, with all ground-truth solutions verified by the Gurobi solver. Each task presents a natural-language scenario with entities, constraints, and an optimization objective; the model must return a structured solution that a deterministic verifier checks against every constraint and the solver-proven optimum. We evaluate six frontier models on 200 tasks and find that…
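
The verification loop described in the abstract can be sketched on a toy selection task. Everything below is an assumption made for illustration: the item data, the two constraints, the `{"selected": [...]}` solution schema, and the brute-force search that stands in for the Gurobi-proven optimum. It is not the benchmark's actual verifier.

```python
from itertools import combinations

# Toy task (hypothetical): pick items to maximize value
# subject to a weight capacity and a limit on item count.
ITEMS = {
    "A": {"weight": 4, "value": 10},
    "B": {"weight": 3, "value": 7},
    "C": {"weight": 2, "value": 5},
    "D": {"weight": 5, "value": 11},
}
CAPACITY = 7
MAX_ITEMS = 2


def violations(selected):
    """Check every constraint and return a list of violated ones."""
    problems = []
    unknown = [s for s in selected if s not in ITEMS]
    if unknown:
        problems.append(f"unknown items: {unknown}")
    if len(set(selected)) != len(selected):
        problems.append("duplicate items")
    known = [s for s in selected if s in ITEMS]
    if sum(ITEMS[s]["weight"] for s in known) > CAPACITY:
        problems.append("capacity exceeded")
    if len(selected) > MAX_ITEMS:
        problems.append("too many items")
    return problems


def objective(selected):
    """Total value of the selected items."""
    return sum(ITEMS[s]["value"] for s in selected)


def brute_force_optimum():
    """Stand-in for a solver: enumerate all feasible subsets."""
    best = 0
    for k in range(MAX_ITEMS + 1):
        for combo in combinations(ITEMS, k):
            if not violations(list(combo)):
                best = max(best, objective(combo))
    return best


def verify(solution):
    """Deterministically score a structured solution."""
    selected = solution.get("selected", [])
    problems = violations(selected)
    feasible = not problems
    optimum = brute_force_optimum()
    value = objective(selected) if feasible else None
    return {
        "feasible": feasible,
        "violations": problems,
        "objective": value,
        "optimum": optimum,
        "optimal": feasible and value == optimum,
    }


report = verify({"selected": ["C", "D"]})
```

Here `["C", "D"]` is feasible but worth 16 against an optimum of 17 (items A and B), so the report marks it feasible and not optimal. Reporting the violated constraints alongside the verdict keeps feasibility failures distinguishable from optimality gaps.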

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications