On the Limit of Language Models as Planning Formalizers

Cassie Huang; Li Zhang

arXiv:2412.09879·cs.CL·June 3, 2025


TL;DR

This paper evaluates how well large language models generate complete formal planning representations, such as PDDL, from natural language descriptions, and identifies where this formalization ability holds up and where it degrades.

Contribution

The paper systematically assesses LLMs' capacity to produce complete PDDL representations from natural language descriptions and compares the effectiveness and robustness of this approach against direct plan generation.

Findings

01. Most large models effectively formalize descriptions as PDDL.
02. Performance decreases as descriptions become more natural.
03. Models are robust to lexical perturbations.

Abstract

Large Language Models have been found to create plans that are neither executable nor verifiable in grounded environments. An emerging line of work demonstrates success in using the LLM as a formalizer to generate a formal representation of the planning domain in some language, such as Planning Domain Definition Language (PDDL). This formal representation can be deterministically solved to find a plan. We systematically evaluate this methodology while bridging some major gaps. While previous work only generates a partial PDDL representation, given templated, and therefore unrealistic, environment descriptions, we generate the complete representation given descriptions of various naturalness levels. Among an array of observations critical to improving LLMs' formal planning abilities, we note that most large enough models can effectively formalize descriptions as PDDL, outperforming those…
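To make the "formalize, then solve deterministically" idea concrete, here is a minimal sketch of the solving half. It is not the paper's code and does not parse PDDL: it assumes the formalizer's output has already been grounded into STRIPS-style actions, held here in a hypothetical `Action` tuple, and it uses plain breadth-first search in place of a real planner. The block names, fact strings, and the `solve` helper are all illustrative assumptions.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """A grounded STRIPS-style action (hypothetical stand-in for PDDL)."""
    name: str
    pre: frozenset      # facts that must hold before the action applies
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def solve(init, goal, actions):
    """Breadth-first search over states; returns a shortest plan or None."""
    start = frozenset(init)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:               # every goal fact holds
            return plan
        for a in actions:
            if a.pre <= state:          # preconditions satisfied
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [a.name]))
    return None                         # goal unreachable


# Assumed toy problem: block a sits on block b; the goal is b on a.
actions = [
    Action("unstack a b",
           frozenset({"on a b", "clear a", "handempty"}),
           frozenset({"holding a", "clear b"}),
           frozenset({"on a b", "clear a", "handempty"})),
    Action("putdown a",
           frozenset({"holding a"}),
           frozenset({"ontable a", "clear a", "handempty"}),
           frozenset({"holding a"})),
    Action("pickup b",
           frozenset({"ontable b", "clear b", "handempty"}),
           frozenset({"holding b"}),
           frozenset({"ontable b", "clear b", "handempty"})),
    Action("stack b a",
           frozenset({"holding b", "clear a"}),
           frozenset({"on b a", "clear b", "handempty"}),
           frozenset({"holding b", "clear a"})),
]
init = {"on a b", "ontable b", "clear a", "handempty"}
goal = {"on b a"}

plan = solve(init, goal, actions)
print(plan)
```

The point of the sketch is the division of labor: once the environment is expressed as preconditions and effects, finding the plan involves no model calls, and a returned plan is executable by construction. Any error in the final plan therefore traces back to the formalization step, which is the step the paper evaluates.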


Code & Models

Repositories

cassiehuang22/llm-as-pddl-formalizer (PyTorch, official)

Videos

On the Limit of Language Models as Planning Formalizers (Underline)

Taxonomy

Topics: Model-Driven Software Engineering Techniques