On Grounded Planning for Embodied Tasks with Language Models

Bill Yuchen Lin; Chengsong Huang; Qian Liu; Wenda Gu; Sam Sommerer; Xiang Ren

arXiv:2209.00465 · cs.AI · July 18, 2023 · 1 citation

TL;DR

This paper investigates whether language models can generate grounded, executable plans for embodied tasks. It introduces G-PlanET, a novel problem formulation, together with an evaluation protocol, and shows that encoding the environment improves planning performance.

Contribution

It presents the first study of grounded planning with language models, introduces G-PlanET and an evaluation protocol, and demonstrates the benefits of environment encoding and iterative decoding strategies.

Findings

01 Encoding the environment as a table improves planning accuracy.

02 Iterative decoding enhances plan quality.

03 Grounded planning performance varies with how the environment is encoded.

Abstract

Language models (LMs) have demonstrated their capability in possessing commonsense knowledge of the physical world, a crucial aspect of performing tasks in everyday life. However, it remains unclear **whether LMs have the capacity to generate grounded, executable plans for embodied tasks.** This is a challenging task as LMs lack the ability to perceive the environment through vision and feedback from the physical environment. In this paper, we address this important research question and present the first investigation into the topic. Our novel problem formulation, named **G-PlanET**, inputs a high-level goal and a data table about objects in a specific environment, and then outputs a step-by-step actionable plan for a robotic agent to follow. To facilitate the study, we establish an **evaluation protocol** and design a dedicated metric to assess the quality of the plans. Our…
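The input side of the formulation described in the abstract (a high-level goal plus a data table about objects) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the column names, the `|` separator, the prompt layout, and the helper names `linearize_table` and `build_planner_input` are all hypothetical.

```python
from typing import Dict, List


def linearize_table(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Flatten an object table into text: one header line, then one line per row.

    The ' | ' separator and the column set are illustrative assumptions.
    """
    header = " | ".join(columns)
    body = [" | ".join(str(row.get(col, "")) for col in columns) for row in rows]
    return "\n".join([header] + body)


def build_planner_input(goal: str, rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Combine the goal and the linearized table into a single text input.

    A language model would be asked to continue this text with a step-by-step plan.
    """
    return f"Goal: {goal}\nObjects:\n{linearize_table(rows, columns)}\nPlan:"


# Hypothetical environment: two objects with made-up attributes.
objects = [
    {"name": "apple", "location": "countertop"},
    {"name": "fridge", "location": "kitchen"},
]
prompt = build_planner_input("Put the apple in the fridge", objects, ["name", "location"])
print(prompt)
```

The sketch only covers how a goal and a table might be turned into model input; producing the plan itself and scoring it against the paper's metric are outside its scope.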

Code & Models

Datasets

yuchenlin/G-PlanET
dataset · 145 downloads

Videos

On Grounded Planning for Embodied Tasks with Language Models · underline

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems