RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan; Noah Brown; Justice Carbajal; Yevgen Chebotar; Joseph Dabis; Chelsea Finn; Keerthana Gopalakrishnan; Karol Hausman; Alex Herzog; Jasmine Hsu; Julian Ibarz; Brian Ichter; Alex Irpan; Tomas Jackson; Sally Jesmonth; Nikhil J Joshi; Ryan Julian; Dmitry Kalashnikov; Yuheng Kuang; Isabel Leal; Kuang-Huei Lee; Sergey Levine; Yao Lu; Utsav Malla; Deeksha Manjunath; Igor Mordatch; Ofir Nachum; Carolina Parada; Jodilyn Peralta; Emily Perez; Karl Pertsch; Jornell Quiambao; Kanishka Rao; Michael Ryoo; Grecia Salazar; Pannag Sanketi; Kevin Sayed; Jaspiar Singh; Sumedh Sontakke; Austin Stone; Clayton Tan; Huong Tran; Vincent Vanhoucke; Steve Vega; Quan Vuong; Fei Xia; Ted Xiao; Peng Xu; Sichun Xu; Tianhe Yu; Brianna Zitkovich

arXiv:2212.06817 · cs.RO · August 14, 2023 · 38 cites

TL;DR

This paper introduces the Robotics Transformer, a high-capacity model trained on large-scale, diverse robotic data, demonstrating improved generalization and scalability for real-world robotic control tasks.

Contribution

It presents a novel Transformer-based model architecture for robotics, emphasizing open-ended, task-agnostic training and scalability in data and model size.

Findings

01. The Robotics Transformer generalizes well across tasks with diverse data.
02. Model performance improves with increased data and model size.
03. Open-ended training enhances robotic control capabilities.

Abstract

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a…

Code & Models

Repositories

google-research/robotics_transformer (TensorFlow, official)

Models

LeTau/Minimal_VLA (Hugging Face model)

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Machine Learning and Data Classification

Methods: Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Softmax · Layer Normalization · Dropout · Byte Pair Encoding · Linear Layer
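
Several of the methods listed above (softmax, attention, residual connection, layer normalization) are the generic building blocks of a Transformer layer. The sketch below shows how they might compose in pure Python. It is an illustration only, not the RT-1 implementation: it uses a single attention head rather than multi-head attention, identity projections instead of learned linear layers, and the names `attention` and `attention_block` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (near) unit variance.
    Assumption: no learned scale or shift parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def attention_block(tokens):
    """Self-attention, then a residual connection, then layer normalization.
    Assumption: identity query/key/value projections (no learned weights)."""
    attended = attention(tokens, tokens, tokens)
    return [layer_norm([t + a for t, a in zip(tok, att)])
            for tok, att in zip(tokens, attended)]


# Three toy 4-dimensional tokens (made-up values for illustration).
tokens = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
out = attention_block(tokens)
```

Each output row has the same dimensionality as its input token and is centered at zero by the layer normalization step. A full Transformer layer would add learned projections, multiple heads, dropout, and a position-wise feed-forward sublayer on top of this.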