Learning Reward Functions for Robotic Manipulation by Observing Humans

Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, and Cordelia Schmid

arXiv:2211.09019·cs.RO·March 8, 2023

TL;DR

This paper introduces a method to learn general reward functions for robotic manipulation by observing unlabeled human videos, enabling robots to better explore and learn tasks without task-specific demonstrations.

Contribution

The work derives a task-agnostic reward function from unlabeled human videos. The learned reward transfers to image observations from a previously unseen robot embodiment and environment, and it requires no task-specific data.

Findings

01. The learned reward function generalizes to unseen robot and environment configurations.

02. The method accelerates reinforcement learning for manipulation tasks in simulation.

03. It does not require task-specific human demonstrations or predefined correspondences.

Abstract

Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies. Thanks to the diversity of this training data, the learned reward function sufficiently generalizes to image observations from a previously unseen robot embodiment and environment to provide a meaningful prior for directed exploration in reinforcement learning. We propose two methods for scoring states relative to a goal image: through direct temporal regression, and through distances in an embedding space obtained with…
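
The two scoring ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The encoder here is a hypothetical fixed random linear map standing in for a learned image embedding, the function names are invented for illustration, and the choice of "frames remaining until the final frame" as a temporal regression target is an assumption about how such targets could be built from an unlabeled video.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]


def make_random_embedding(input_dim: int, embed_dim: int, seed: int = 0) -> Encoder:
    """Hypothetical stand-in for a learned image encoder: a fixed random linear map."""
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(input_dim)]
        for _ in range(embed_dim)
    ]

    def embed(image: Sequence[float]) -> Vector:
        # Each output coordinate is a dot product of one weight row with the input.
        return [sum(w * x for w, x in zip(row, image)) for row in weights]

    return embed


def embedding_distance_reward(
    embed: Encoder, observation: Sequence[float], goal: Sequence[float]
) -> float:
    """Score a state as the negative Euclidean distance to the goal in embedding space."""
    return -math.dist(embed(observation), embed(goal))


def temporal_regression_targets(num_frames: int) -> List[int]:
    """Assumed regression targets: for frame t, the number of frames left until the
    last frame of the video, which is treated as the goal image."""
    return [num_frames - 1 - t for t in range(num_frames)]


if __name__ == "__main__":
    embed = make_random_embedding(input_dim=4, embed_dim=3, seed=0)
    goal = [1.0, 0.0, 0.5, 0.2]
    near = [1.1, 0.0, 0.5, 0.2]   # small perturbation of the goal
    far = [1.2, 0.0, 0.5, 0.2]    # twice the perturbation, same direction
    print(embedding_distance_reward(embed, near, goal))
    print(embedding_distance_reward(embed, far, goal))
    print(temporal_regression_targets(4))
```

In both variants a state scores higher the closer it is judged to be to the goal image, which is the property that makes such a score usable as a dense reward signal for exploration in reinforcement learning.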

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning