From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data

Zichen Jeff Cui; Yibin Wang; Nur Muhammad Mahi Shafiullah; Lerrel Pinto

arXiv:2210.10047·cs.RO·December 19, 2022·6 cites

Open Access · 3 Datasets · 1 Video

TL;DR

This paper introduces Conditional Behavior Transformers (C-BeT), a method for learning task-centric robot behaviors from noisy, uncurated play data. On a suite of simulated benchmark tasks it improves on the prior state of the art by an average of 45.7%, and it also learns useful behaviors on real robots from play data alone.

Contribution

The paper presents C-BeT, a new approach that combines multi-modal generation with goal conditioning to learn from unstructured robot play data without task labels.
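
One way to read "goal conditioning without task labels" is that the goal for a training example is simply an observation that appears later in the same play trajectory. The sketch below illustrates that sampling step only. It is not the authors' implementation: the function name `sample_future_conditioned_example` and the parameters `window` and `max_goal_offset` are hypothetical, and the model that would consume these examples is omitted.

```python
import random
from typing import Any, List, Sequence, Tuple


def sample_future_conditioned_example(
    observations: Sequence[Any],
    actions: Sequence[Any],
    window: int,
    max_goal_offset: int,
    rng: random.Random,
) -> Tuple[List[Any], List[Any], Any]:
    """Sample (obs_window, action_window, goal) from one unlabeled trajectory.

    Assumption for illustration: the goal is an observation that occurs
    strictly after the sampled window, so no task label or reward is needed.
    """
    if len(observations) != len(actions):
        raise ValueError("observations and actions must have equal length")
    if window < 1 or max_goal_offset < 1:
        raise ValueError("window and max_goal_offset must be at least 1")
    n = len(observations)
    if n < window + 1:
        raise ValueError("trajectory too short for this window")

    # The last valid start leaves at least one later step to act as the goal.
    start = rng.randrange(0, n - window)
    end = start + window  # exclusive end index of the window

    # Pick the goal from the steps right after the window, clipped to the
    # end of the trajectory.
    last_goal = min(end + max_goal_offset - 1, n - 1)
    goal_index = rng.randint(end, last_goal)

    return (
        list(observations[start:end]),
        list(actions[start:end]),
        observations[goal_index],
    )


if __name__ == "__main__":
    # Toy trajectory: integer "observations" and matching "actions".
    obs = list(range(10))
    acts = [o * 10 for o in obs]
    obs_window, act_window, goal = sample_future_conditioned_example(
        obs, acts, window=3, max_goal_offset=4, rng=random.Random(0)
    )
    print(obs_window, act_window, goal)
```

A policy trained on such triples would predict the actions in the window given the observations and the goal; at test time the goal can be replaced by an observation of the desired outcome. How far ahead the goal is drawn is a design choice in this sketch, not a value taken from the paper.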

Findings

01. C-BeT improves on the prior state of the art in learning from play data by an average of 45.7% across a suite of simulated benchmark tasks.

02. C-BeT enables learning useful behaviors on real robots solely from play data.

03. The method works without explicit task labels or reward signals.

Abstract

While large-scale sequence modeling from offline data has led to impressive performance gains in natural language and image generation, directly translating such ideas to robotics has been challenging. One critical reason for this is that uncurated robot demonstration data, i.e. play data, collected from non-expert human demonstrators are often noisy, diverse, and distributionally multi-modal. This makes extracting useful, task-centric behaviors from such data a difficult generative modeling problem. In this work, we present Conditional Behavior Transformers (C-BeT), a method that combines the multi-modal generation ability of Behavior Transformer with future-conditioned goal specification. On a suite of simulated benchmark tasks, we find that C-BeT improves upon prior state-of-the-art work in learning from play data by an average of 45.7%. Further, we demonstrate for the first time…

Videos

From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Reinforcement Learning in Robotics

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Softmax · Adam · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization