DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints
Yinger Zhang, Shutong Jiang, Renhao Li, Jianhong Tu, Yang Su, Lianghao Deng, Xudong Guo, Chenxu Lv, Junyang Lin

TL;DR
DeepPlanning introduces a challenging benchmark for long-horizon agent planning involving complex constraints, revealing current LLM limitations and guiding future improvements in explicit reasoning and tool use.
Contribution
The paper presents DeepPlanning, a new benchmark for practical long-horizon planning with real-world constraints, highlighting the need for improved reasoning and tool integration in agentic LLMs.
Findings
Even frontier agentic LLMs struggle with long-horizon planning tasks that combine local constraints with global constrained optimization.
Explicit reasoning patterns improve planning effectiveness.
Parallel tool use enhances efficiency in complex tasks.
Abstract
While agent evaluation has shifted toward long-horizon tasks, most benchmarks still emphasize local, step-level reasoning rather than the global constrained optimization (e.g., time and financial budgets) that demands genuine planning ability. Meanwhile, existing LLM planning benchmarks underrepresent the active information gathering and fine-grained local constraints typical of real-world settings. To address this, we introduce DeepPlanning, a challenging benchmark for practical long-horizon agent planning. It features multi-day travel planning and multi-product shopping tasks that require proactive information acquisition, local constrained reasoning, and global constrained optimization. Evaluations on DeepPlanning show that even frontier agentic LLMs struggle with these problems, highlighting the importance of reliable explicit reasoning patterns and parallel tool use for achieving…
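To make the distinction between local constraints and global constrained optimization concrete, here is a minimal sketch of how a travel plan might be checked against both kinds of constraint. It is an illustration only and not the benchmark's actual verifier: the `Activity` fields, the `check_plan` function, and the specific rules (opening hours, no overlaps, a total budget) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    # Hypothetical schema for one planned item; not taken from the benchmark.
    name: str
    day: int        # 1-based day of the trip
    start: float    # start time in hours on a 24h clock
    end: float      # end time in hours on a 24h clock
    cost: float     # price of the activity
    opens: float    # venue opening time
    closes: float   # venue closing time


def check_plan(plan, budget, max_days):
    """Return a list of constraint violations (empty if the plan is valid)."""
    violations = []

    # Local constraints: each activity is checked on its own.
    for a in plan:
        if a.start < a.opens or a.end > a.closes:
            violations.append(f"{a.name}: outside opening hours")
        if not 1 <= a.day <= max_days:
            violations.append(f"{a.name}: day {a.day} is outside the trip")

    # Local constraints between neighbours: no overlap on the same day.
    ordered = sorted(plan, key=lambda a: (a.day, a.start))
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.day == cur.day and cur.start < prev.end:
            violations.append(f"{prev.name} overlaps {cur.name}")

    # Global constraint: the whole plan must fit the financial budget.
    total = sum(a.cost for a in plan)
    if total > budget:
        violations.append(f"total cost {total:.2f} exceeds budget {budget:.2f}")

    return violations


if __name__ == "__main__":
    plan = [
        Activity("museum", 1, 9.0, 11.0, 20.0, 9.0, 17.0),
        Activity("lunch", 1, 10.5, 12.0, 15.0, 10.0, 22.0),
    ]
    for v in check_plan(plan, budget=30.0, max_days=2):
        print(v)
```

The point of the sketch is that a plan can pass every per-item check and still fail the global budget, which is why step-level reasoning alone is not enough for this kind of task.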
Taxonomy
Topics: Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization · AI-based Problem Solving and Planning
