Analysis of Optimality of Large Language Models on Planning Problems
Bernd Bohnet, Michael C. Mozer, Kevin Swersky, Wil Cunningham, Aaron Parisi, Kathleen Kenealy, Noah Fiedel

TL;DR
This paper examines whether frontier large language models plan optimally on Blocksworld-style planning problems, reports near-optimal performance, and proposes hypotheses for how the models achieve it where traditional algorithms fall short.
Contribution
It demonstrates that LLMs can approach optimal solutions in planning tasks and introduces hypotheses explaining their reasoning mechanisms.
Findings
LLMs outperform traditional satisficing planners in complex scenarios.
LLMs track theoretical optimality limits with high precision.
Classical search algorithms struggle as the search space grows, whereas the LLMs tested remain close to optimal.
Abstract
Classic AI planning problems have been revisited in the Large Language Model (LLM) era, with recent benchmarks focusing on success rates rather than plan efficiency. We examine the degree to which frontier models reason optimally versus relying on simple, heuristic, and possibly inefficient strategies. We focus on the Blocksworld domain, which involves towers of labeled blocks that must be moved from an initial to a goal configuration via a set of primitive actions. We also study a formally equivalent task, the generalized Path-Star graph, in order to isolate true topological reasoning from semantic priors. We systematically manipulate problem depth (the height of block towers), width (the number of towers), and compositionality (the number of goal blocks). Reasoning-enhanced LLMs significantly outperform traditional satisficing planners (e.g., LAMA) in complex, multi-goal…
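To make the distinction between optimal and merely successful plans concrete, here is a minimal Python sketch; it is not the paper's code. It assumes a simplified Blocksworld with a single move action (take the top block of one tower and place it on another tower or on the table) and a fully specified goal configuration, whereas the paper varies the number of goal blocks. The names `canonical`, `successors`, `optimal_plan`, and `unstack_all_length` are hypothetical. Breadth-first search returns a shortest plan, and the "unstack everything, then rebuild" count stands in for a simple strategy that always succeeds but may take more moves.

```python
from collections import deque


def canonical(towers):
    """Drop empty towers and sort, so table positions are interchangeable.

    Each tower is listed bottom to top.
    """
    return tuple(sorted(tuple(t) for t in towers if t))


def successors(state):
    """Yield ((block, destination), next_state) for every legal single move."""
    towers = [list(t) for t in state]
    for i, src in enumerate(towers):
        block = src[-1]  # only the top block of a tower can move
        for j in list(range(len(towers))) + [None]:  # None means the table
            if j == i:
                continue
            if j is None and len(src) == 1:
                continue  # block is already alone on the table
            nxt = [list(t) for t in towers]
            nxt[i].pop()
            if j is None:
                nxt.append([block])
                dest = "table"
            else:
                nxt[j].append(block)
                dest = towers[j][-1]  # block that was on top before the move
            yield (block, dest), canonical(nxt)


def optimal_plan(start, goal):
    """Breadth-first search: returns a shortest list of moves, or None."""
    start, goal = canonical(start), canonical(goal)
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while parents[state] is not None:
                state, move = parents[state]
                plan.append(move)
            return plan[::-1]
        for move, nxt in successors(state):
            if nxt not in parents:
                parents[nxt] = (state, move)
                queue.append(nxt)
    return None


def unstack_all_length(start, goal):
    """Moves used by a simple strategy: put every stacked block on the
    table, then rebuild the goal towers from the bottom up."""
    stacked_now = sum(len(t) - 1 for t in start if t)
    stacked_goal = sum(len(t) - 1 for t in goal if t)
    return stacked_now + stacked_goal


start = [["A", "B", "C"]]  # A on the table, C on top
goal = [["C", "B", "A"]]   # the same tower reversed
print(len(optimal_plan(start, goal)), unstack_all_length(start, goal))
```

On this three-block reversal the shortest plan has 3 moves, while the unstack-and-rebuild strategy uses 4. Exhaustive search of this kind grows quickly with the number of blocks and towers, which is the regime the depth and width manipulations are meant to probe.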
