Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents

Zehong Wang; Fang Wu; Hongru Wang; Xiangru Tang; Bolian Li; Zhenfei Yin; Yijun Ma; Yiyang Li; Weixiang Sun; Xiusi Chen; Yanfang Ye

arXiv:2601.22311 · cs.AI · February 2, 2026

TL;DR

This paper analyzes why LLM-based agents struggle with long-horizon planning. It attributes the failure to a mismatch between step-wise reasoning, which behaves like a greedy policy, and planning problems in which early actions must account for delayed consequences. It introduces FLARE, a planning method that improves long-horizon decision making and outperforms standard reasoning approaches.

Contribution

The paper introduces FLARE, a planning-centric approach that incorporates explicit lookahead and value propagation, addressing reasoning failures in long-horizon decision making for LLM agents.

Findings

01. FLARE improves task performance across multiple benchmarks.

02. With FLARE, LLM agents can outperform GPT-4 that uses standard reasoning.

03. Reasoning-based policies tend to make early myopic commitments that are amplified over long horizons and are difficult to recover from.

Abstract

Large language model (LLM)-based agents exhibit strong step-by-step reasoning capabilities over short horizons, yet often fail to sustain coherent behavior over long planning horizons. We argue that this failure reflects a fundamental mismatch: step-wise reasoning induces a form of step-wise greedy policy that is adequate for short horizons but fails in long-horizon planning, where early actions must account for delayed consequences. From this planning-centric perspective, we study LLM-based agents in deterministic, fully structured environments with explicit state transitions and evaluation signals. Our analysis reveals a core failure mode of reasoning-based policies: locally optimal choices induced by step-wise scoring lead to early myopic commitments that are systematically amplified over time and difficult to recover from. We introduce FLARE (Future-aware Lookahead with Reward…
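The contrast the abstract draws between step-wise greedy choice and lookahead can be illustrated with a toy example. The sketch below is not FLARE and is not taken from the paper; the environment, the `TRANSITIONS` table, and the function names are all hypothetical and chosen only to show how a locally optimal first action can lock an agent into a poor outcome in a deterministic environment, and how propagating future value back to the first decision avoids it.

```python
from functools import lru_cache

# Hypothetical deterministic environment (an assumption for illustration):
# state -> {action: (next_state, immediate_reward)}
TRANSITIONS = {
    "start": {"left": ("trap", 2.0), "right": ("hall", 1.0)},
    "trap": {"stay": ("end", 0.0)},
    "hall": {"forward": ("vault", 0.0)},
    "vault": {"open": ("end", 10.0)},
    "end": {},
}


@lru_cache(maxsize=None)
def value(state, depth):
    """Best total reward reachable from `state` within `depth` steps."""
    actions = TRANSITIONS[state]
    if depth == 0 or not actions:
        return 0.0
    return max(r + value(nxt, depth - 1) for nxt, r in actions.values())


def greedy_rollout(state):
    """Step-wise greedy policy: always take the highest immediate reward."""
    total, path = 0.0, []
    while TRANSITIONS[state]:
        action, (nxt, r) = max(
            TRANSITIONS[state].items(), key=lambda kv: kv[1][1]
        )
        total, state = total + r, nxt
        path.append(action)
    return total, path


def lookahead_rollout(state, depth):
    """Score each action by immediate reward plus propagated future value."""
    total, path = 0.0, []
    while TRANSITIONS[state]:
        action, (nxt, r) = max(
            TRANSITIONS[state].items(),
            key=lambda kv: kv[1][1] + value(kv[1][0], depth - 1),
        )
        total, state = total + r, nxt
        path.append(action)
    return total, path


print(greedy_rollout("start"))        # (2.0, ['left', 'stay'])
print(lookahead_rollout("start", 3))  # (11.0, ['right', 'forward', 'open'])
```

With `depth=1` the lookahead score reduces to the immediate reward, so the two policies coincide; the early commitment to `left` is only avoided once the horizon is deep enough to see the delayed reward.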

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning