RoboGPT: an intelligent agent of making embodied long-term decisions for daily instruction tasks

Yaran Chen; Wenbo Cui; Yuanwen Chen; Mining Tan; Xinyao Zhang; Dongbin Zhao; He Wang

arXiv:2311.15649 · cs.RO · September 16, 2024 · 2 cites

Open Access

TL;DR

RoboGPT is a novel robotic agent that combines large language models with re-planning and specialized skills to improve long-term decision-making and task execution in daily instruction tasks, outperforming existing methods.

Contribution

The paper introduces RoboGPT, integrating LLM-based planning with a re-plan module and specialized RoboSkills, supported by a new robotic dataset, to enhance feasibility and adaptability in robotic task planning.

Findings

01. Outperforms SOTA on ALFRED daily tasks

02. Exceeds SOTA LLM planners in task rationality

03. Generalizes well to unseen tasks

Abstract

Robotic agents must master common sense and long-term sequential decision-making to solve daily tasks given natural language instructions. Advances in Large Language Models (LLMs) in natural language processing have inspired efforts to use LLMs for complex robot planning. Although LLMs generalize well and comprehend instruction tasks, LLM-generated task plans sometimes lack feasibility and correctness. To address this problem, we propose the RoboGPT agent (our code and dataset will be released soon) for making embodied long-term decisions for daily tasks, with two modules: 1) LLM-based planning with re-plan, which breaks the task into multiple sub-goals; 2) RoboSkill, individually designed for sub-goals to learn better navigation and manipulation skills. The LLM-based planning is enhanced with a new robotic dataset and re-plan, called RoboGPT. The new robotic dataset of…
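
The two-module structure described in the abstract (a planner that emits sub-goals and re-plans on failure, plus skills that execute each sub-goal) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: `SubGoal`, `run_task`, `toy_planner`, the action names, and the Mug/Cup substitution are hypothetical and are not taken from the RoboGPT code, which the abstract says is not yet released. The toy planner is a hard-coded stand-in for an LLM call, and the toy skill is a stand-in for a learned navigation or manipulation policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class SubGoal:
    action: str  # hypothetical action name, e.g. "Pickup"
    target: str  # hypothetical object name, e.g. "Mug"


# Hypothetical interfaces (not from the paper):
# planner(instruction, unavailable_objects) -> ordered sub-goals ([] if no plan)
Planner = Callable[[str, Set[str]], List[SubGoal]]
# skill(sub_goal) -> True if the sub-goal was executed successfully
Skill = Callable[[SubGoal], bool]


def run_task(instruction: str, planner: Planner, skill: Skill,
             max_replans: int = 2) -> Tuple[bool, List[SubGoal]]:
    """Execute sub-goals in order; on failure, feed the failure back and re-plan."""
    unavailable: Set[str] = set()
    done: List[SubGoal] = []
    plan = planner(instruction, unavailable)
    if not plan:
        return False, done
    replans = 0
    while plan:
        sub_goal = plan.pop(0)
        if skill(sub_goal):
            done.append(sub_goal)
            continue
        if replans == max_replans:
            return False, done
        replans += 1
        # Environment feedback: the failed target is treated as unavailable.
        unavailable.add(sub_goal.target)
        new_plan = planner(instruction, unavailable)
        if not new_plan:
            return False, done
        # Skip sub-goals that were already completed before the failure.
        plan = [g for g in new_plan if g not in done]
    return True, done


def toy_planner(instruction: str, unavailable: Set[str]) -> List[SubGoal]:
    """Stand-in for an LLM planner: picks the first usable container object."""
    candidates = ["Mug", "Cup"]  # made-up substitution list
    obj = next((c for c in candidates if c not in unavailable), None)
    if obj is None:
        return []
    return [SubGoal("Find", obj), SubGoal("Pickup", obj),
            SubGoal("Find", "Table"), SubGoal("Put", "Table")]


if __name__ == "__main__":
    scene = {"Cup", "Table"}  # no Mug in this toy scene
    ok, executed = run_task("put a mug on the table", toy_planner,
                            lambda g: g.target in scene)
    print(ok, [(g.action, g.target) for g in executed])
```

In this toy run the first plan targets a Mug, the `Find` skill fails because the scene has none, and the re-plan step substitutes a Cup. A real system would replace the candidate list with an LLM query conditioned on the failure and the current observations.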

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications