Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers

Haoyu Wang; Yuliang Song; Tao Li; Zhiwei Deng; Yaqing Wang; Deepak Ramachandran; Eldan Cohen; Dan Roth

arXiv:2605.12421 · cs.AI · May 13, 2026

TL;DR

This paper evaluates how solver representation and heuristic search optimization affect the correctness and efficiency of LLM-generated combinatorial solvers. It identifies a "heuristic trap", in which attempts to optimize search cause regressions, and argues for formalizing the problem rather than optimizing the search.

Contribution

The paper introduces the CP-SynC-XL benchmark (100 combinatorial problems, 4,577 instances) and systematically compares three solver-construction paradigms, revealing the pitfalls of heuristic search optimization in LLM-based solver synthesis.

Findings

01. Python + OR-Tools achieves the highest correctness across LLMs.

02. Heuristic search optimization yields minimal speed-ups and can reduce correctness.

03. Heuristic traps often cause regressions, leading to over-constraining or unverified bounds.
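
To make the over-constraining failure mode concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not an example from the paper: the toy knapsack instance, the `solve` helper, and the `must_take_densest_item` rule are all hypothetical. The "optimized" run adds a plausible-looking but unverified constraint (always take the item with the best value-to-weight ratio), which cuts the true optimum out of the search space.

```python
from itertools import combinations

# Hypothetical toy knapsack instance: (weight, value) pairs and a capacity.
ITEMS = [(6, 30), (5, 20), (5, 20)]
CAPACITY = 10


def is_feasible(subset):
    """The only real constraint of the problem: stay within capacity."""
    return sum(ITEMS[i][0] for i in subset) <= CAPACITY


def total_value(subset):
    return sum(ITEMS[i][1] for i in subset)


def solve(extra_constraint=None):
    """Enumerate every subset and return the best one that passes all checks."""
    best = None
    for size in range(len(ITEMS) + 1):
        for subset in combinations(range(len(ITEMS)), size):
            if not is_feasible(subset):
                continue
            if extra_constraint is not None and not extra_constraint(subset):
                continue
            if best is None or total_value(subset) > total_value(best):
                best = subset
    return best


def must_take_densest_item(subset):
    # Unverified "speed-up" (assumption for illustration): force the item
    # with the highest value/weight ratio into every candidate solution.
    densest = max(range(len(ITEMS)), key=lambda i: ITEMS[i][1] / ITEMS[i][0])
    return densest in subset


plain = solve()                         # faithful formalization
pruned = solve(must_take_densest_item)  # over-constrained variant

print("plain :", plain, total_value(plain))
print("pruned:", pruned, total_value(pruned))
```

On this instance the faithful model selects the two lighter items for a value of 40, while the over-constrained variant is forced to take the heavy item and stops at 30. Both answers satisfy the capacity constraint, so a check that only tests feasibility would not flag the regression.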

Abstract

Large Language Models (LLMs) struggle to solve complex combinatorial problems through direct reasoning, so recent neuro-symbolic systems increasingly use them to synthesize executable solvers. A central design question is how the LLM should represent the solver, and whether it should also attempt to optimize search. We introduce CP-SynC-XL, a benchmark of 100 combinatorial problems (4,577 instances), and evaluate three solver-construction paradigms: native algorithmic search (Python), constraint modeling through a Python solver API (Python + OR-Tools), and declarative constraint modeling (MiniZinc + OR-Tools). We find a consistent representational divergence: Python + OR-Tools attains the highest correctness across LLMs, while MiniZinc + OR-Tools has lower absolute coverage despite using the same OR-Tools back-end. Native Python is the most likely to return a schema-valid solution that…
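
The difference between native algorithmic search and constraint modeling can be sketched on a toy problem. The example below is a rough illustration using only the standard library. N-Queens is chosen for familiarity and is not claimed to be in CP-SynC-XL, and `generic_solve` is a hypothetical stand-in for a solver back-end, not the OR-Tools or MiniZinc API. In the first style the program author writes the search. In the second, the author only states constraints and leaves the search to a generic engine.

```python
from itertools import product

N = 6  # hypothetical board size for the illustration


def queens_backtracking(n):
    """Native algorithmic search: the author hand-writes the backtracking."""
    cols = []  # cols[r] is the column of the queen placed in row r

    def safe(row, col):
        # No earlier queen shares the column or a diagonal with (row, col).
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place(row):
        if row == n:
            return True
        for col in range(n):
            if safe(row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()
        return False

    return list(cols) if place(0) else None


# Modeling style: only the constraints are stated, with no search logic.
CONSTRAINTS = [
    lambda q: len(set(q)) == len(q),                             # columns differ
    lambda q: len({q[r] + r for r in range(len(q))}) == len(q),  # one diagonal
    lambda q: len({q[r] - r for r in range(len(q))}) == len(q),  # other diagonal
]


def generic_solve(n, constraints):
    """Stand-in for a solver back-end: plain enumeration, no heuristics."""
    for candidate in product(range(n), repeat=n):
        if all(check(candidate) for check in constraints):
            return list(candidate)
    return None


native_solution = queens_backtracking(N)
model_solution = generic_solve(N, CONSTRAINTS)

# Verify both answers against the same declarative constraints.
for solution in (native_solution, model_solution):
    assert solution is not None
    assert all(check(solution) for check in CONSTRAINTS)

print(native_solution, model_solution)
```

One reason to keep the constraints as a separate artifact is that they double as an independent verifier: any solution, however it was produced, can be checked against the same model.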
