Scaling data-driven robotics with reward sketching and batch reinforcement learning

Serkan Cabi; Sergio Gómez Colmenarejo; Alexander Novikov; Ksenia Konyushkova; Scott Reed; Rae Jeong; Konrad Zolna; Yusuf Aytar; David Budden; Mel Vecerik; Oleg Sushkov; David Barker; Jonathan Scholz; Misha Denil; Nando de Freitas; Ziyu Wang

arXiv:1909.12200 · cs.RO · June 5, 2020 · 45 cites

Open Access · 1 Repo

TL;DR

This paper introduces a scalable data-driven robotics framework that leverages reward sketching and batch reinforcement learning to enable robots to learn multiple complex manipulation tasks from demonstrations and recorded experience.

Contribution

The paper presents a novel framework combining reward sketching with batch RL to scale data-driven robot learning across diverse tasks using large datasets.

Findings

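01 Applied on a real robot platform to manipulation tasks including stacking rigid objects and handling cloth.

02 Learned reward functions let the framework scale to several tasks, including ones where the reward signal cannot be acquired directly.

03 Batch RL trains policies offline from a large dataset of experience recorded across different tasks.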

Abstract

We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions. We show how to apply this framework to accomplish three different object manipulation tasks on a real robot platform. Given demonstrations of a task together with task-agnostic recorded experience, we use a special form of human annotation as supervision to learn a reward function, which enables us to deal with real-world tasks where the reward signal cannot be acquired directly. Learned rewards are used in combination with a large dataset of experience from different tasks to learn a robot policy offline using batch RL. We show that using our approach it is possible to train agents to perform a variety of challenging manipulation tasks including stacking rigid objects and handling cloth.
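
The workflow in the abstract (annotate some recorded experience, fit a reward function, then relabel the whole dataset before offline training) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names fit_reward_model, predict_reward and relabel_episodes are hypothetical, per-frame observations are reduced to small feature vectors, the human annotation is represented as one scalar reward per frame, and ridge regression stands in for whatever reward model the authors actually train. The batch RL step itself is not shown.

```python
import numpy as np


def fit_reward_model(features, annotated_rewards, l2=1e-3):
    """Fit a linear reward model by ridge regression (illustrative stand-in).

    features: (N, D) array of per-frame features.
    annotated_rewards: (N,) array of human-annotated per-frame rewards.
    Returns a weight vector of length D + 1 (last entry is the bias).
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ annotated_rewards)


def predict_reward(w, features):
    """Predict a reward for every frame in `features`."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ w


def relabel_episodes(w, episodes):
    """Attach predicted rewards to every recorded episode, annotated or not."""
    return [dict(ep, rewards=predict_reward(w, ep["features"])) for ep in episodes]


# Synthetic stand-in data: 200 annotated frames with 3 features each.
rng = np.random.default_rng(0)
true_w = np.array([0.5, -0.2, 0.1])
annotated_features = rng.normal(size=(200, 3))
annotations = annotated_features @ true_w + 0.3

w = fit_reward_model(annotated_features, annotations, l2=1e-8)

# Task-agnostic recorded experience without annotations: 4 episodes of 10 frames.
episodes = [{"features": rng.normal(size=(10, 3))} for _ in range(4)]
relabeled = relabel_episodes(w, episodes)
```

In this sketch the relabeled list plays the role of the reward-annotated dataset that an offline (batch) RL learner would consume. The point of the design is that only a subset of frames needs human annotation, while the learned model supplies rewards for all remaining recorded experience.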

Code & Models

Repositories

deepmind/deepmind-research · tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Human Pose and Action Recognition