Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos
Annie S. Chen, Suraj Nair, Chelsea Finn

TL;DR
This paper introduces a domain-agnostic video discriminator that learns a reward function from human videos. The learned measure of task success generalizes across diverse environments and tasks while requiring minimal robot data.
Contribution
The work presents a novel approach to learning robotic reward functions from in-the-wild human videos, enhancing zero-shot generalization to new environments and tasks.
Findings
Zero-shot generalization to unseen environments
Zero-shot generalization to unseen tasks
Successful manipulation on a real robot from a single human demonstration
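As a rough illustration of how a video discriminator can act as a reward, the sketch below scores a robot rollout by how likely it shows the same task as one human demonstration, then picks the best of several candidate rollouts. Everything in it is an assumption made for illustration rather than the authors' implementation: the `Video` type (a list of per-frame feature vectors standing in for RGB frames), the mean-pooling `encode`, the hand-written distance inside `same_task_score` (a trained classifier would take its place), and the helper names `reward` and `select_best`.

```python
import math
from typing import List, Sequence

# Hypothetical stand-in for a video: one small feature vector per frame.
Video = List[List[float]]


def encode(video: Video) -> List[float]:
    """Mean-pool per-frame features into one video embedding (toy encoder)."""
    n_frames = len(video)
    dim = len(video[0])
    return [sum(frame[i] for frame in video) / n_frames for i in range(dim)]


def same_task_score(video_a: Video, video_b: Video, scale: float = 1.0) -> float:
    """Toy discriminator: a score in (0, 1], higher when embeddings are closer.

    Assumption: a learned same-task classifier would replace this distance.
    """
    ea, eb = encode(video_a), encode(video_b)
    sq_dist = sum((a - b) ** 2 for a, b in zip(ea, eb))
    return math.exp(-scale * sq_dist)


def reward(robot_video: Video, human_demo: Video) -> float:
    """Reward for a robot rollout: its same-task score against the demo."""
    return same_task_score(robot_video, human_demo)


def select_best(candidates: Sequence[Video], human_demo: Video) -> int:
    """Return the index of the candidate rollout with the highest reward."""
    rewards = [reward(c, human_demo) for c in candidates]
    return max(range(len(candidates)), key=lambda i: rewards[i])


# Usage: one human demo and two candidate robot rollouts (made-up numbers).
demo = [[0.0, 1.0], [1.0, 1.0]]
rollouts = [
    [[5.0, 5.0], [5.0, 5.0]],   # far from the demo in feature space
    [[0.1, 1.0], [0.9, 1.0]],   # close to the demo in feature space
]
best = select_best(rollouts, demo)
```

The point of the design is that the reward depends only on video observations and a single demonstration, so the same scoring function can be reused when the environment or task changes. How well that transfers depends entirely on the trained discriminator, which this toy distance does not model.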
Abstract
We are motivated by the goal of generalist robots that can complete a wide range of tasks across many environments. Critical to this is the robot's ability to acquire some metric of task success or reward, which is necessary for reinforcement learning, planning, or knowing when to ask for help. For a general-purpose robot operating in the real world, this reward function must also be able to generalize broadly across environments, tasks, and objects, while depending only on on-board sensor observations (e.g. RGB images). While deep learning on large and diverse datasets has shown promise as a path towards such generalization in computer vision and natural language, collecting high quality datasets of robotic interaction at scale remains an open challenge. In contrast, "in-the-wild" videos of humans (e.g. YouTube) contain an extensive collection of people doing interesting tasks across a…
