Query-Efficient Planning with Language Models

Gonzalo Gonzalez-Pumariega, Wayne Chen, Kushal Kedia, and Sanjiban Choudhury

arXiv:2412.06162·cs.AI·December 10, 2024

TL;DR

This paper studies two frameworks for using large language models (LLMs) in planning tasks and shows that using an LLM as a generative planner requires fewer world-model interactions and adapts to feedback faster than using an LLM as a search heuristic.

Contribution

It introduces and compares two LLM-based planning frameworks, highlighting the efficiency and adaptability of generative planning over heuristic approaches.

Findings

01 Generative LLM planning reduces interactions significantly.

02 LLMs adapt planning strategies more rapidly with feedback.

03 Both frameworks outperform baseline methods.

Abstract

Planning in complex environments requires an agent to efficiently query a world model to find a feasible sequence of actions from start to goal. Recent work has shown that Large Language Models (LLMs), with their rich prior knowledge and reasoning capabilities, can potentially help with planning by searching over promising states and adapting to feedback from the world. In this paper, we propose and study two fundamentally competing frameworks that leverage LLMs for query-efficient planning. The first uses LLMs as a heuristic within a search-based planner to select promising nodes to expand and propose promising actions. The second uses LLMs as a generative planner to propose an entire sequence of actions from start to goal, query a world model, and adapt based on feedback. We show that while both approaches improve upon comparable baselines, using an LLM as a generative planner results…
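The two frameworks described in the abstract can be contrasted with a minimal sketch on a toy grid world. Everything here is an assumption made for illustration: `WorldModel`, `stub_llm_score`, and `stub_llm_propose` are hypothetical names, the LLM is replaced by simple hand-written stand-ins (Manhattan distance for the heuristic, breadth-first search over a believed map for the plan proposer), and the code is not the paper's ToI or Boomerang implementation.

```python
from collections import deque
from heapq import heappop, heappush

MOVES = {"R": (1, 0), "L": (-1, 0), "U": (0, 1), "D": (0, -1)}


class WorldModel:
    """Deterministic grid world that counts every query made to it."""

    def __init__(self, size, walls):
        self.size, self.walls, self.queries = size, set(walls), 0

    def step(self, state, action):
        """Return the successor state, or None if the move is infeasible."""
        self.queries += 1
        dx, dy = MOVES[action]
        nxt = (state[0] + dx, state[1] + dy)
        inside = 0 <= nxt[0] < self.size and 0 <= nxt[1] < self.size
        return nxt if inside and nxt not in self.walls else None


def stub_llm_score(state, goal):
    """Hypothetical stand-in for an LLM heuristic: Manhattan distance."""
    return abs(goal[0] - state[0]) + abs(goal[1] - state[1])


def heuristic_search(world, start, goal):
    """Framework 1: greedy best-first search guided by the stub heuristic."""
    frontier, seen = [(stub_llm_score(start, goal), start, [])], {start}
    while frontier:
        _, state, plan = heappop(frontier)
        if state == goal:
            return plan
        for action in MOVES:
            nxt = world.step(state, action)  # one query per expanded action
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                heappush(frontier, (stub_llm_score(nxt, goal), nxt, plan + [action]))
    return None


def stub_llm_propose(start, goal, blocked, size):
    """Hypothetical stand-in for an LLM proposing a full plan.

    Plans by BFS over a believed map in which every cell is free unless
    feedback has marked it blocked. It never queries the world model.
    """
    queue, parent = deque([start]), {start: None}
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, action = parent[state]
                plan.append(action)
            return plan[::-1]
        for action, (dx, dy) in MOVES.items():
            nxt = (state[0] + dx, state[1] + dy)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in blocked and nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None


def generative_planning(world, start, goal, max_rounds=50):
    """Framework 2: propose a whole plan, verify it, adapt to feedback."""
    blocked = set()  # feedback gathered from failed verification steps
    for _ in range(max_rounds):
        plan = stub_llm_propose(start, goal, blocked, world.size)
        if plan is None:
            return None
        state = start
        for action in plan:
            nxt = world.step(state, action)  # one query per verified step
            if nxt is None:
                dx, dy = MOVES[action]
                blocked.add((state[0] + dx, state[1] + dy))
                break
            state = nxt
        else:
            return plan  # every step was feasible and the plan ends at goal
    return None


walls, start, goal = {(1, 0), (1, 1)}, (0, 0), (3, 0)
for planner in (heuristic_search, generative_planning):
    world = WorldModel(4, walls)
    print(planner.__name__, planner(world, start, goal), world.queries)
```

The loop at the end prints each planner's plan and the number of world-model queries it used. Those counts are specific to this toy grid and these stand-ins, so they illustrate how queries are tallied and do not reproduce any result from the paper.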

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 3 · Confidence 3

Strengths

1. Utilizing LLMs for efficient planning is an important and highly timely topic. 2. The paper attempts to connect the proposed methods with past theories. 3. Multiple experiments validate the effectiveness of the proposed methods for these tasks.

Weaknesses

1. Regarding the proposed framework, it appears to be somewhat informally defined. Is the task defined as an MDP or a POMDP? How is the keyword "query-efficient" emphasized throughout the framework? Is it reasonable to assume a deterministic world model? 2. Regarding the two proposed methods, there are many variable factors in their design, such as whether ToI-BFS needs to traverse every state at each step, whether there is feedback from the world model at each step, and the type of feedback pro

Reviewer 02 · Rating 3 · Confidence 3

Strengths

1. The authors provide a well-structured comparison of various planning approaches based on their success rates in achieving goals within a limited number of world model queries. 2. The analysis includes a comprehensive evaluation of LLM calls and token usage (Table 1), showcasing the efficiency of different methods. For instance, Boomerang requires significantly fewer LLM calls compared to ToI-BFS and ReAct, indicating its effectiveness in balancing query efficiency and computational resource u

Weaknesses

1. Methodologically, the authors are not proposing anything new. Several works have already proposed the two methodologies explored by the authors (e.g., [1]). 2. The generative planning approach with LLMs may be limited when scaling to more complex or longer planning horizons, as the model might forget early feedback or re-propose failed actions over extended sequences. 3. The limitations mentioned by the authors in Section 7 are significant.

Reviewer 03 · Rating 8 · Confidence 4

Strengths

The paper is well-organized and very clearly written. It gives the key finding right up front: an LLM used as a generative planner is more query-efficient than a planner that uses an LLM as a heuristic, because LLMs are more adaptive to feedback from the world model than a traditional planner that uses an LLM as a heuristic. The key contributions are clearly defined: a. Framework for query-efficient planning using LLMs b. Two new algorithms: ToI and Boomerang c. Evaluation of LLM and classical planners’ query effic

Weaknesses

The main weakness is an unaddressed question: the proposed methods use LLMs to improve query efficiency with the goal of reducing computational cost, but is there a net benefit to computational cost once the increased cost of the LLM calls is considered? (See "Questions" for more detail.)

Reviewer 04 · Rating 5 · Confidence 4

Strengths

1. The paper is easy to read because the main idea is quite straightforward (using LLMs as heuristics or as generative planners is arguably a known approach that has been used in prior papers). 2. The experiments are well conducted and extensive. They are broken down such that each question can be answered directly from a figure or table. 3. Failure modes are shown in the paper to corroborate each method's contribution and capabilities.

Weaknesses

1. The main idea and motivation of the paper are quite simple. While this is not really an issue per se, I feel that there are several places in the paper where contributions are overclaimed. (a) The abstract states that "we propose and study two fundamentally competing frameworks ... the first uses LLMs as a heuristic within a search-based planner to select promising nodes to expand and propose promising actions ... the second uses LLMs as a generative planner". However, I am quite sure this

Code & Models

Repositories

portal-cornell/llms-for-planning
Official

Taxonomy

Topics: AI-based Problem Solving and Planning · Logic, Reasoning, and Knowledge · Machine Learning and Algorithms