On the Limits of Innate Planning in Large Language Models

Charles Schepanowski; Charles Ling

arXiv:2511.21591 · cs.AI · November 27, 2025

TL;DR

This paper uses the 8-puzzle to test the planning and reasoning abilities of large language models without external tools, and finds significant limitations in internal state management and heuristic planning.

Contribution

It provides a systematic evaluation of LLMs on a classic planning task, highlighting their deficiencies and the need for mechanisms to maintain explicit state and structured search.

Findings

01

Models struggle with internal state representation and valid move generation.

02

Even with an external move validator that supplies only valid moves, none of the models solve any puzzles.

03

All tested models exhibit weak heuristic planning and looping behaviors.

Abstract

Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the 8-puzzle: a classic task that requires state tracking and goal-directed planning while allowing precise, step-by-step evaluation. Four models are tested under common prompting conditions (Zero-Shot, Chain-of-Thought, Algorithm-of-Thought) and with tiered corrective feedback. Feedback improves success rates for some model-prompt combinations, but many successful runs are long, computationally expensive, and indirect. We then examine the models with an external move validator that provides only valid moves. Despite this level of assistance, none of the models solve any puzzles in this setting. Qualitative analysis reveals two dominant deficits across all…
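To make the "external move validator" idea concrete, the following is a minimal sketch of how legal moves could be enumerated for an 8-puzzle state. It is not the paper's implementation, which this listing does not describe. The board encoding (a 9-tuple in row-major order with 0 as the blank), the move names (which describe the blank's direction of travel), and the function name `valid_moves` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Assumed encoding: 9 entries in row-major order, 0 marks the blank.
State = Tuple[int, ...]

# Assumed convention: a move names the direction the blank travels.
# Each value is the change in the blank's index on a 3x3 board.
MOVES: Dict[str, int] = {"up": -3, "down": 3, "left": -1, "right": 1}


def valid_moves(state: State) -> Dict[str, State]:
    """Map each legal move name to the state it produces."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    # A move is legal only if the blank stays on the board.
    allowed = {
        "up": row > 0,
        "down": row < 2,
        "left": col > 0,
        "right": col < 2,
    }
    result: Dict[str, State] = {}
    for name, delta in MOVES.items():
        if not allowed[name]:
            continue
        target = blank + delta
        board = list(state)
        # Swap the blank with the neighbouring tile.
        board[blank], board[target] = board[target], board[blank]
        result[name] = tuple(board)
    return result
```

A validator of this kind rules out illegal moves but says nothing about which legal move brings the board closer to the goal, so choosing among the returned options still requires planning.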

Taxonomy

Topics: AI-based Problem Solving and Planning · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education