Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos

Annie S. Chen; Suraj Nair; Chelsea Finn

arXiv:2103.16817·cs.RO·April 1, 2021

TL;DR

This paper introduces a domain-agnostic video discriminator that learns reward functions from human videos together with a small amount of robot data, so that the learned measure of task success generalizes across diverse environments and tasks.

Contribution

The work presents a novel approach to learning robotic reward functions from in-the-wild human videos, enhancing zero-shot generalization to new environments and tasks.

Findings

01. Zero-shot generalization to unseen environments
02. Zero-shot generalization to unseen tasks
03. Successful manipulation on a real robot from a single human demo
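
To make the third finding concrete, here is a minimal sketch of how a video-similarity score could serve as a reward when only a single human demo is available. It is an illustration under stated assumptions, not the paper's implementation: the `Video` representation, `embed`, `same_task_score`, and `pick_best` are all hypothetical names, and the distance-based score is a toy stand-in for a trained discriminator.

```python
import math
from typing import List, Sequence

# Hypothetical representation: one feature vector per frame.
Video = List[List[float]]


def embed(video: Video) -> List[float]:
    """Average per-frame features into one vector (stand-in for a learned video encoder)."""
    n_frames = len(video)
    dim = len(video[0])
    return [sum(frame[d] for frame in video) / n_frames for d in range(dim)]


def same_task_score(video_a: Video, video_b: Video, scale: float = 1.0) -> float:
    """Toy discriminator: map embedding distance to a score in (0, 1].

    A trained model would replace this; here, closer embeddings score higher.
    """
    ea, eb = embed(video_a), embed(video_b)
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(ea, eb)))
    return math.exp(-scale * dist)


def pick_best(demo: Video, candidates: Sequence[Video]) -> int:
    """Return the index of the candidate video that scores highest against the demo."""
    scores = [same_task_score(demo, c) for c in candidates]
    return max(range(len(candidates)), key=lambda i: scores[i])


# Made-up feature values, for illustration only.
human_demo = [[0.0, 1.0], [1.0, 1.0]]
robot_rollouts = [
    [[5.0, 5.0], [6.0, 5.0]],  # far from the demo in feature space
    [[0.1, 1.0], [0.9, 1.0]],  # close to the demo
]
best = pick_best(human_demo, robot_rollouts)
```

In this sketch the score plays the role of a reward: a planner or reinforcement learning loop would prefer the rollout whose video looks most like the human demo under the scoring function.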

Abstract

We are motivated by the goal of generalist robots that can complete a wide range of tasks across many environments. Critical to this is the robot's ability to acquire some metric of task success or reward, which is necessary for reinforcement learning, planning, or knowing when to ask for help. For a general-purpose robot operating in the real world, this reward function must also be able to generalize broadly across environments, tasks, and objects, while depending only on on-board sensor observations (e.g. RGB images). While deep learning on large and diverse datasets has shown promise as a path towards such generalization in computer vision and natural language, collecting high quality datasets of robotic interaction at scale remains an open challenge. In contrast, "in-the-wild" videos of humans (e.g. YouTube) contain an extensive collection of people doing interesting tasks across a…
