Generalized Planning in PDDL Domains with Pretrained Large Language Models
Tom Silver, Soham Dan, Kavitha Srinivas, Joshua B. Tenenbaum, Leslie Pack Kaelbling, Michael Katz

TL;DR
This paper explores using GPT-4 to synthesize Python programs that act as generalized planners in PDDL domains. It reports strong performance from only a few training tasks and emphasizes the importance of automated debugging.
Contribution
It introduces an approach that uses GPT-4 to synthesize programs serving as generalized planners in PDDL domains, combined with CoT summarization and automated debugging.
Findings
GPT-4 effectively generalizes across multiple PDDL domains.
Automated debugging significantly improves planning accuracy.
Two training tasks often suffice for strong generalization.
Abstract
Recent work has considered whether large language models (LLMs) can function as planners: given a task, generate a plan. We investigate whether LLMs can serve as generalized planners: given a domain and training tasks, generate a program that efficiently produces plans for other tasks in the domain. In particular, we consider PDDL domains and use GPT-4 to synthesize Python programs. We also consider (1) Chain-of-Thought (CoT) summarization, where the LLM is prompted to summarize the domain and propose a strategy in words before synthesizing the program; and (2) automated debugging, where the program is validated with respect to the training tasks, and in case of errors, the LLM is re-prompted with four types of feedback. We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. Overall, we find that GPT-4 is a surprisingly powerful generalized…
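The automated debugging step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy number-line task stands in for a PDDL task, `stub_llm` stands in for a call to GPT-4, and the feedback categories (exception, malformed plan, goal not reached) are illustrative, since the abstract does not enumerate its four feedback types.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical toy task: move from `start` to `goal` on a number line
# using the actions "inc" and "dec". This stands in for a PDDL task.
Task = Tuple[int, int]
Plan = List[str]
Planner = Callable[[Task], Plan]


def validate(planner: Planner, task: Task) -> Optional[str]:
    """Run the planner on one task; return None on success, else feedback text."""
    start, goal = task
    try:
        plan = planner(task)
    except Exception as exc:  # illustrative feedback: the program crashed
        return f"exception: {exc!r}"
    if not isinstance(plan, list) or any(a not in ("inc", "dec") for a in plan):
        return f"malformed plan: {plan!r}"
    state = start
    for action in plan:
        state += 1 if action == "inc" else -1
    if state != goal:
        return f"plan ends in state {state}, goal is {goal}"
    return None


def debug_loop(
    propose: Callable[[Optional[str]], Planner],
    train_tasks: List[Task],
    max_rounds: int = 4,
) -> Optional[Planner]:
    """Ask for a program, validate it on the training tasks, re-prompt on failure."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        planner = propose(feedback)
        feedback = None
        for task in train_tasks:
            feedback = validate(planner, task)
            if feedback is not None:
                break  # report the first failure back to the proposer
        if feedback is None:
            return planner  # all training tasks solved
    return None


def buggy(task: Task) -> Plan:
    start, goal = task
    return ["inc"] * (goal - start)  # wrong when goal < start


def fixed(task: Task) -> Plan:
    start, goal = task
    return ["inc"] * (goal - start) if goal >= start else ["dec"] * (start - goal)


def stub_llm(feedback: Optional[str]) -> Planner:
    """Hypothetical stand-in for the LLM: returns a repaired program once it sees feedback."""
    return buggy if feedback is None else fixed


planner = debug_loop(stub_llm, train_tasks=[(0, 3), (5, 2)])
```

In this sketch the first candidate fails on the second training task, the failure message is passed back to the proposer, and the second candidate passes validation and can then be applied to unseen tasks such as `(10, 7)`. A real setup would also need to bound the running time of each candidate program, which this sketch omits.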
Taxonomy
Topics: Software Engineering Research · Natural Language Processing Techniques · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Layer · Weight Decay · Attention Dropout · Position-Wise Feed-Forward Layer · Dense Connections
