On the Limits of Innate Planning in Large Language Models
Charles Schepanowski, Charles Ling

TL;DR
This paper investigates the planning and reasoning abilities of large language models using the 8-puzzle, revealing significant limitations in internal state management and heuristic planning without external tools.
Contribution
It provides a systematic evaluation of LLMs on a classic planning task, highlighting their deficiencies and the need for mechanisms to maintain explicit state and structured search.
Findings
Models struggle with internal state representation and valid move generation.
Even with an external move validator that supplies only valid moves, none of the models solve any puzzles.
All tested models exhibit weak heuristic planning and looping behaviors.
Abstract
Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the 8-puzzle: a classic task that requires state tracking and goal-directed planning while allowing precise, step-by-step evaluation. Four models are tested under common prompting conditions (Zero-Shot, Chain-of-Thought, Algorithm-of-Thought) and with tiered corrective feedback. Feedback improves success rates for some model-prompt combinations, but many successful runs are long, computationally expensive, and indirect. We then examine the models with an external move validator that provides only valid moves. Despite this level of assistance, none of the models solve any puzzles in this setting. Qualitative analysis reveals two dominant deficits across all…
Taxonomy
Topics: AI-based Problem Solving and Planning · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
