Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

Wenlong Huang; Pieter Abbeel; Deepak Pathak; Igor Mordatch

arXiv:2201.07207 · cs.LG · March 9, 2022 · 160 cites

TL;DR

This paper demonstrates that large language models can be used as zero-shot planners to generate executable action sequences for embodied agents, with a semantic translation step improving action admissibility.

Contribution

It introduces a novel method for grounding high-level natural language tasks into executable plans using LLMs without additional training.

Findings

01 LLMs can decompose high-level tasks into plans without training

02 Semantic translation improves plan executability in virtual environments

03 Human evaluation indicates a trade-off between plan correctness and executability

Abstract

Can world knowledge learned by large language models (LLMs) be used to act in interactive environments? In this paper, we investigate the possibility of grounding high-level tasks, expressed in natural language (e.g. "make breakfast"), to a chosen set of actionable steps (e.g. "open fridge"). While prior work focused on learning from explicit step-by-step examples of how to act, we surprisingly find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into mid-level plans without any further training. However, the plans produced naively by LLMs often cannot map precisely to admissible actions. We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions. Our evaluation in the recent VirtualHome environment shows that the resulting method substantially improves…
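The semantic translation step mentioned in the abstract can be illustrated with a minimal sketch: each free-form step produced by the language model is replaced by the most similar entry in a fixed list of admissible actions. Everything below is an assumption made for illustration. The action list, the function names (`bag_of_words`, `cosine`, `translate_step`), and the bag-of-words cosine similarity are hypothetical stand-ins, not the authors' implementation; a learned sentence embedding would normally take the place of the word-count vectors.

```python
import math
import re
from collections import Counter

# Hypothetical admissible actions; a real environment defines its own set.
ADMISSIBLE_ACTIONS = [
    "open fridge",
    "grab milk",
    "walk to kitchen",
    "switch on stove",
]


def bag_of_words(text):
    """Count lowercase alphabetic tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def translate_step(step, admissible_actions):
    """Return (action, score) for the admissible action closest to `step`."""
    step_vec = bag_of_words(step)
    scored = [(action, cosine(step_vec, bag_of_words(action)))
              for action in admissible_actions]
    # max() keeps the first candidate on ties, so list order breaks ties.
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    for free_form in ["open the fridge door", "go and walk to the kitchen"]:
        action, score = translate_step(free_form, ADMISSIBLE_ACTIONS)
        print(f"{free_form!r} -> {action!r} (similarity {score:.2f})")
```

The returned score could also serve as a rejection threshold, so that a generated step with no close admissible match is discarded rather than forced onto an unrelated action; whether and how to do that is a design choice outside this sketch.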

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

huangwl18/language-planner · Official

Videos

Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (+Author) · YouTube

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques