Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers
Haoyu Wang, Yuliang Song, Tao Li, Zhiwei Deng, Yaqing Wang, Deepak Ramachandran, Eldan Cohen, Dan Roth

TL;DR
This paper evaluates how solver representation and heuristic search optimization affect the correctness and efficiency of LLM-generated combinatorial solvers. It highlights the risk of heuristic traps and argues for formalizing the problem rather than optimizing the search.
Contribution
The paper introduces the CP-SynC-XL benchmark (100 combinatorial problems, 4,577 instances) and systematically compares three solver-construction paradigms, revealing the pitfalls of heuristic search optimization in LLM-based solver synthesis.
Findings
Python + OR-Tools achieves the highest correctness across LLMs.
Heuristic search optimization yields minimal speed-ups and can reduce correctness.
Heuristic traps often cause regressions, typically through over-constraining or unverified bounds.
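The over-constraining failure mode can be illustrated with a minimal sketch. It is not taken from the paper or from CP-SynC-XL: the knapsack instance and the function names `solve_exact` and `solve_with_heuristic` are hypothetical, chosen only to show how an unverified "speed-up" can silently remove the optimal solution from the search space.

```python
from itertools import combinations

# Hypothetical 0/1 knapsack instance: (weight, value) pairs and a capacity.
ITEMS = [(6, 30), (5, 20), (5, 20)]
CAPACITY = 10


def solve_exact(items, capacity):
    """Formal baseline: enumerate every subset and keep the best feasible value."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for w, _ in subset) <= capacity:
                best = max(best, sum(v for _, v in subset))
    return best


def solve_with_heuristic(items, capacity):
    """Same search plus an unverified assumption: the item with the best
    value/weight ratio is forced into the solution. This shrinks the search
    space but can exclude the true optimum (over-constraining)."""
    densest = max(items, key=lambda it: it[1] / it[0])
    rest = list(items)
    rest.remove(densest)
    if densest[0] > capacity:
        return solve_exact(rest, capacity)
    return densest[1] + solve_exact(rest, capacity - densest[0])


if __name__ == "__main__":
    print("exact:", solve_exact(ITEMS, CAPACITY))
    print("heuristic:", solve_with_heuristic(ITEMS, CAPACITY))
```

On this instance the exhaustive solver returns 40 (the two lighter items), while the heuristic variant returns 30, because forcing the densest item in leaves no room for the better pair. Both outputs are feasible packings, so a check that only validates the output format or feasibility would not catch the regression.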
Abstract
Large Language Models (LLMs) struggle to solve complex combinatorial problems through direct reasoning, so recent neuro-symbolic systems increasingly use them to synthesize executable solvers. A central design question is how the LLM should represent the solver, and whether it should also attempt to optimize search. We introduce CP-SynC-XL, a benchmark of 100 combinatorial problems (4,577 instances), and evaluate three solver-construction paradigms: native algorithmic search (Python), constraint modeling through a Python solver API (Python + OR-Tools), and declarative constraint modeling (MiniZinc + OR-Tools). We find a consistent representational divergence: Python + OR-Tools attains the highest correctness across LLMs, while MiniZinc + OR-Tools has lower absolute coverage despite using the same OR-Tools back-end. Native Python is the most likely to return a schema-valid solution that…
