Large Language Models Can Solve Real-World Planning Rigorously with Formal Verification Tools
Yilun Hao, Yongchao Chen, Yang Zhang, Chuchu Fan

TL;DR
This paper introduces an LLM-based framework that formalizes complex multi-constraint planning problems as constrained satisfiability problems and solves them with formal solvers, reaching a 93.9% success rate on TravelPlanner and generalizing to unseen constraints and domains.
Contribution
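The authors propose an LLM-based planning framework that formalizes multi-constraint planning problems as constrained satisfiability problems and passes them to sound and complete satisfiability solvers. On TravelPlanner the framework reaches a 93.9% success rate, compared with the 10% the abstract reports for OpenAI o1-preview.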
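To make the idea of "planning as satisfiability" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the data, the `solve` function, and its parameters are all hypothetical, and plain exhaustive enumeration stands in for a real satisfiability solver. It only illustrates how a travel query can be expressed as constraints over a finite set of choices and then checked mechanically.

```python
# Hypothetical toy data, not taken from TravelPlanner.
FLIGHTS = {"Denver": 220, "Austin": 180, "Boston": 310}  # round-trip price, USD
HOTELS = {"Denver": [90, 140], "Austin": [110, 75], "Boston": [160, 200]}  # nightly price, USD


def solve(budget, nights, max_nightly=None):
    """Return the first assignment that satisfies every constraint, or None.

    Exhaustive enumeration is a stand-in for a real solver; it is sound and
    complete here only because the toy domain is finite and tiny.
    """
    for city in FLIGHTS:
        for nightly in HOTELS[city]:
            total = FLIGHTS[city] + nights * nightly
            constraints = [
                total <= budget,                                # budget constraint
                max_nightly is None or nightly <= max_nightly,  # optional hotel cap
            ]
            if all(constraints):
                return {"city": city, "nightly": nightly, "total": total}
    return None  # no assignment satisfies the query


print(solve(budget=500, nights=3))  # a satisfiable query
print(solve(budget=300, nights=3))  # an unsatisfiable query
```

In this sketch, a returned plan satisfies every stated constraint by construction, and `None` means that no plan exists in the toy domain. That is the kind of guarantee a sound and complete solver provides on the formalized problem.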
Findings
Achieves a 93.9% success rate on the TravelPlanner benchmark
Successfully generalizes to unseen constraints and domains
Effectively identifies unsatisfiable queries and suggests modifications
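The last finding, detecting an unsatisfiable query and proposing a change, can be illustrated with a small Python sketch. This is an assumption-laden toy, not the procedure from the paper: the candidate plans, the constraint names, and the helper functions `satisfying` and `suggest_relaxations` are hypothetical. The sketch drops one constraint at a time and reports which single removals make the query satisfiable.

```python
# Hypothetical candidate plans: (city, total cost in USD, cuisine available).
CANDIDATES = [
    ("Denver", 490, "Mexican"),
    ("Austin", 405, "BBQ"),
    ("Boston", 790, "Seafood"),
]


def satisfying(constraints):
    """Return the candidates that satisfy every named constraint."""
    return [c for c in CANDIDATES if all(check(c) for check in constraints.values())]


def suggest_relaxations(constraints):
    """If the query is unsatisfiable, list constraints whose removal fixes it."""
    if satisfying(constraints):
        return []  # already satisfiable, nothing to suggest
    suggestions = []
    for name in constraints:
        # Rebuild the query without one constraint and test it again.
        relaxed = {k: v for k, v in constraints.items() if k != name}
        if satisfying(relaxed):
            suggestions.append(name)
    return suggestions


# Hypothetical user query: cheap trip AND seafood, which no candidate offers.
query = {
    "budget<=450": lambda c: c[1] <= 450,
    "cuisine==Seafood": lambda c: c[2] == "Seafood",
}

print(suggest_relaxations(query))
```

Here either constraint can be relaxed to recover a feasible plan, so both names are reported. A production system would more plausibly rely on a solver's own explanation of the conflict than on this brute-force loop.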
Abstract
Large Language Models (LLMs) struggle to directly generate correct plans for complex multi-constraint planning problems, even with self-verification and self-critique. For example, a U.S. domestic travel planning benchmark TravelPlanner was proposed in Xie et al. (2024), where the best LLM OpenAI o1-preview can only find viable travel plans with a 10% success rate given all needed information. In this work, we tackle this by proposing an LLM-based planning framework that formalizes and solves complex multi-constraint planning problems as constrained satisfiability problems, which are further consumed by sound and complete satisfiability solvers. We start with TravelPlanner as the primary use case and show that our framework achieves a success rate of 93.9% and is effective with diverse paraphrased prompts. More importantly, our framework has strong zero-shot generalizability,…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
