R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling
Raj Jain, Marc Wetter

TL;DR
This paper introduces R-ConstraintBench, a scalable framework for evaluating large language models on NP-Complete scheduling problems as the number and variety of constraints grow, and shows where the models break down in high-constraint scenarios.
Contribution
We present R-ConstraintBench, a novel benchmark for assessing LLMs on resource-constrained scheduling problems with increasing complexity and constraint interactions.
Findings
Strong models perform well on simple precedence-only tasks.
Feasibility drops significantly with additional constraint types.
Constraint interactions, not graph depth, limit model performance.
Abstract
Effective scheduling under tight resource, timing, and operational constraints underpins large-scale planning across sectors such as capital projects, manufacturing, logistics, and IT fleet transitions. However, the reliability of large language models (LLMs) when reasoning under high-constraint regimes is insufficiently characterized. To address this gap, we present R-ConstraintBench, a scalable framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP), an NP-Complete feasibility class, while difficulty increases via linear growth in constraints. R-ConstraintBench incrementally increases non-redundant precedence constraints in Directed Acyclic Graphs (DAGs) and then introduces downtime, temporal windows, and disjunctive constraints. As an illustrative example, we instantiate the benchmark in a data center migration setting and evaluate multiple LLMs…
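To make the four constraint families named in the abstract concrete, here is a minimal sketch of a feasibility checker for a proposed schedule. It is not the authors' implementation: the `Task` fields, the `violations` function, the half-open interval convention, and the toy data center tasks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    duration: int
    resource: str
    # Optional temporal window: (earliest allowed start, latest allowed finish).
    window: Optional[Tuple[int, int]] = None


def violations(
    tasks: Dict[str, Task],
    start: Dict[str, int],
    precedence: List[Tuple[str, str]],
    downtime: Dict[str, List[Tuple[int, int]]],
    disjunctive: List[Tuple[str, str]],
) -> List[str]:
    """Return every violated constraint; an empty list means the schedule is feasible."""
    end = {name: start[name] + task.duration for name, task in tasks.items()}
    found = []

    # Precedence: the predecessor must finish before the successor starts.
    for before, after in precedence:
        if end[before] > start[after]:
            found.append(f"precedence: {before} -> {after}")

    for name, task in tasks.items():
        # Downtime: a task may not overlap an outage [lo, hi) of its resource.
        for lo, hi in downtime.get(task.resource, []):
            if start[name] < hi and lo < end[name]:
                found.append(f"downtime: {name} on {task.resource}")
        # Temporal window: the task must run entirely inside its window.
        if task.window is not None:
            earliest, latest = task.window
            if start[name] < earliest or end[name] > latest:
                found.append(f"window: {name}")

    # Disjunctive: the two tasks in a pair may not overlap in time.
    for a, b in disjunctive:
        if start[a] < end[b] and start[b] < end[a]:
            found.append(f"disjunctive: {a} / {b}")

    return found


# Toy instance (assumed data, not taken from the paper).
TASKS = {
    "backup": Task(duration=2, resource="rack_a"),
    "migrate": Task(duration=3, resource="rack_a"),
    "verify": Task(duration=1, resource="rack_b", window=(0, 8)),
}
PRECEDENCE = [("backup", "migrate"), ("migrate", "verify")]
DOWNTIME = {"rack_a": [(2, 3)]}
DISJUNCTIVE = [("backup", "verify")]

# Satisfies precedence alone, but "migrate" runs into the rack_a outage.
naive = {"backup": 0, "migrate": 2, "verify": 5}
# Delays "migrate" until the outage is over.
adjusted = {"backup": 0, "migrate": 3, "verify": 6}

print(violations(TASKS, naive, PRECEDENCE, DOWNTIME, DISJUNCTIVE))
print(violations(TASKS, adjusted, PRECEDENCE, DOWNTIME, DISJUNCTIVE))
```

The `naive` schedule is valid when only precedence is considered and becomes infeasible once downtime is added, which is the kind of constraint interaction the benchmark is described as probing.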
Taxonomy
Topics: Constraint Satisfaction and Optimization · Resource-Constrained Project Scheduling · Software System Performance and Reliability
