Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration

Haokun Liu, Yaonan Zhu, Kenji Kato, Atsushi Tsukahara, Izumi Kondo, Tadayoshi Aoyama, and Yasuhisa Hasegawa

arXiv:2406.14097·cs.RO·July 2, 2024

TL;DR

This paper introduces a human-robot collaboration framework that enhances LLM-based robot manipulation by decomposing commands, integrating perception, and learning from human guidance, enabling complex tasks in real-world environments.

Contribution

The paper presents a novel system that combines GPT-4, YOLO-based visual perception, and human guidance to improve the autonomous manipulation capabilities of LLM-based robots.

Findings

01. Improved task performance with complex trajectory planning.
02. Effective learning from human demonstrations.
03. Successful real-world manipulation experiments.

Abstract

Large Language Models (LLMs) are gaining popularity in the field of robotics. However, LLM-based robots are limited to simple, repetitive motions due to the poor integration between language models, robots, and the environment. This paper proposes a novel approach to enhance the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC). The approach involves using a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that can be executed by the robot. The system also employs a YOLO-based perception algorithm, providing visual cues to the LLM, which aids in planning feasible motions within the specific environment. Additionally, an HRC method is proposed by combining teleoperation and Dynamic Movement Primitives (DMP), allowing the LLM-based robot to learn from human guidance. Real-world experiments have been…
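The abstract mentions Dynamic Movement Primitives (DMP) as the mechanism for learning from human guidance. As a rough illustration only, the sketch below shows a generic one-dimensional discrete DMP in its common textbook form: it fits forcing-term weights to a single demonstrated trajectory and then replays the motion, optionally toward a new goal. It is not the authors' implementation. The class name `DMP1D`, the helper `min_jerk`, the gain values, and the synthetic minimum-jerk demonstration (standing in for a teleoperated trajectory) are all assumptions made for this example.

```python
import math


class DMP1D:
    """One-dimensional discrete DMP (textbook form; gains are assumed defaults)."""

    def __init__(self, n_basis=20, alpha_z=25.0, alpha_x=4.0):
        self.alpha_z = alpha_z
        self.beta_z = alpha_z / 4.0  # critically damped spring-damper
        self.alpha_x = alpha_x
        # Basis centres equally spaced in time, mapped through x = exp(-alpha_x * t).
        self.c = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
        # Widths set from the gap to the neighbouring centre.
        gaps = [self.c[i + 1] - self.c[i] for i in range(n_basis - 1)]
        gaps.append(gaps[-1])
        self.h = [1.0 / (d * d) for d in gaps]
        self.w = [0.0] * n_basis
        self.tau, self.y0, self.g = 1.0, 0.0, 0.0

    def _psi(self, x):
        # Gaussian basis activations at phase x.
        return [math.exp(-h * (x - c) ** 2) for h, c in zip(self.h, self.c)]

    def fit(self, demo, dt):
        """Learn forcing-term weights from one demonstrated trajectory."""
        n = len(demo)
        self.tau = (n - 1) * dt
        self.y0, self.g = demo[0], demo[-1]
        vel = _diff(demo, dt)
        acc = _diff(vel, dt)
        num = [0.0] * len(self.w)
        den = [1e-10] * len(self.w)  # small constant avoids division by zero
        for k in range(n):
            x = math.exp(-self.alpha_x * k * dt / self.tau)
            # Forcing term the demonstration would have required at this step.
            f_target = (self.tau ** 2 * acc[k]
                        - self.alpha_z * (self.beta_z * (self.g - demo[k])
                                          - self.tau * vel[k]))
            s = x * (self.g - self.y0)
            # Locally weighted regression, one weight per basis function.
            for i, p in enumerate(self._psi(x)):
                num[i] += p * s * f_target
                den[i] += p * s * s
        self.w = [a / b for a, b in zip(num, den)]

    def rollout(self, dt, goal=None):
        """Integrate the DMP with explicit Euler steps; goal overrides the learned one."""
        g = self.g if goal is None else goal
        y, z, x = self.y0, 0.0, 1.0
        path = [y]
        for _ in range(int(round(self.tau / dt))):
            psi = self._psi(x)
            f = (sum(p * w for p, w in zip(psi, self.w)) / (sum(psi) + 1e-10)
                 * x * (g - self.y0))
            dz = (self.alpha_z * (self.beta_z * (g - y) - z) + f) / self.tau
            dy = z / self.tau
            z += dz * dt
            y += dy * dt
            x += -self.alpha_x * x / self.tau * dt
            path.append(y)
        return path


def _diff(seq, dt):
    # Central differences inside the sequence, one-sided at the two ends.
    n = len(seq)
    out = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        out.append((seq[hi] - seq[lo]) / ((hi - lo) * dt))
    return out


def min_jerk(n, start=0.0, end=1.0):
    # Synthetic smooth reach, used here in place of a real teleoperated demonstration.
    out = []
    for k in range(n):
        s = k / (n - 1)
        out.append(start + (end - start) * (10 * s**3 - 15 * s**4 + 6 * s**5))
    return out


demo = min_jerk(1001)  # 1 s of motion sampled at dt = 0.001
dmp = DMP1D()
dmp.fit(demo, dt=0.001)
same_goal = dmp.rollout(dt=0.001)
new_goal = dmp.rollout(dt=0.001, goal=2.0)
```

In this form, the spring-damper term pulls the state toward the goal while the learned forcing term shapes the path, and the forcing term fades as the phase variable `x` decays. That is why a single demonstration can be replayed toward a different goal without refitting.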

Taxonomy

Methods: Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer · Absolute Position Encodings