Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs

Yusuke Mikami; Andrew Melnik; Jun Miura; Ville Hautamäki

arXiv:2403.13801 · cs.RO · April 9, 2024 · 1 citation

Open Access

TL;DR

This paper presents an approach in which large language models reason over natural language descriptions of the task and scene to generate coordinate-level control commands for robots directly, reducing reliance on code-based policies.

Contribution

The paper introduces a natural language reasoning method for robotics task planning that reduces reliance on intermediate code representations with pre-defined APIs and shows potential for transfer to unseen tasks.

Findings

01. Natural language reasoning improves task success rates.

02. The approach enables transfer of skills to new, unseen tasks.

03. Prompt engineering significantly boosts performance.

Abstract

We demonstrate experimental results with LLMs that address robotics task planning problems. Recently, LLMs have been applied in robotics task planning, particularly using a code generation approach that converts complex high-level instructions into mid-level policy code. In contrast, our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning, and outputs coordinate-level control commands, thus reducing the necessity for intermediate representation code as policies with pre-defined APIs. Our approach is evaluated on a multi-modal prompt simulation benchmark, demonstrating that our prompt engineering experiments with natural language reasoning significantly enhance success rates compared with prompts that omit it. Furthermore, our approach illustrates the potential for natural language descriptions to transfer robotics…
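
The pipeline the abstract describes (text descriptions of the task and scene, natural language reasoning, then a coordinate-level command) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `SceneObject` type, the helper names, and the `PICK (x, y) PLACE (x, y)` output format are all assumptions made for illustration, and the LLM call itself is left out, so the example only covers prompt construction and parsing of a reply.

```python
import re
from dataclasses import dataclass


@dataclass
class SceneObject:
    # Hypothetical scene record: an object name and its 2D position.
    name: str
    x: float
    y: float


def describe_scene(objects):
    """Render scene objects as plain-text lines an LLM can reason over."""
    return "\n".join(
        f"- {o.name} at (x={o.x:.2f}, y={o.y:.2f})" for o in objects
    )


def build_prompt(task, objects):
    """Combine task and scene text, asking for reasoning before the command."""
    return (
        f"Task: {task}\n"
        f"Scene objects:\n{describe_scene(objects)}\n"
        "Reason step by step in plain language, then end with one line "
        "of the form\n"
        "PICK (x, y) PLACE (x, y)"
    )


# Assumed output format, e.g. "PICK (0.10, 0.25) PLACE (0.40, -0.30)".
_NUM = r"(-?\d+(?:\.\d+)?)"
_CMD = re.compile(
    rf"PICK\s*\(\s*{_NUM}\s*,\s*{_NUM}\s*\)\s*"
    rf"PLACE\s*\(\s*{_NUM}\s*,\s*{_NUM}\s*\)"
)


def parse_command(reply):
    """Return ((pick_x, pick_y), (place_x, place_y)) for the last command
    found in the reply, or None if the reply contains no command."""
    matches = _CMD.findall(reply)
    if not matches:
        return None
    px, py, qx, qy = (float(v) for v in matches[-1])
    return (px, py), (qx, qy)
```

Taking the last match lets the model mention coordinates freely while reasoning, with only its final command line treated as the action. A real system would also need to validate the parsed coordinates against the workspace bounds before executing them.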

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Logic, Reasoning, and Knowledge

Methods: Label Smoothing · Cosine Annealing · Absolute Position Encodings · Linear Layer · Position-Wise Feed-Forward Layer · Transformer · GPT-4 · Weight Decay