VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training

Yecheng Jason Ma; Shagun Sodhani; Dinesh Jayaraman; Osbert Bastani; Vikash Kumar; Amy Zhang

arXiv:2210.00030 · cs.RO · March 8, 2023 · 35 cites

TL;DR

VIP introduces a self-supervised pre-training method using human videos to generate dense, smooth reward functions for unseen robotic tasks, enabling effective reward-based control without task-specific data.

Contribution

The paper presents VIP, a novel implicit time contrastive objective for visual representation pre-training that produces reward functions from human videos without action labels.

Findings

1. VIP outperforms prior representations on robotic control tasks.
2. VIP enables few-shot offline RL with minimal trajectories.
3. VIP works effectively on both simulated and real robots.

Abstract

Reward and representation learning are two long-standing challenges for learning an expanding set of robot manipulation skills from sensory observations. Given the inherent cost and scarcity of in-domain, task-specific robot data, learning from large, diverse, offline human videos has emerged as a promising path towards acquiring a generally useful visual representation for control; however, how these human videos can be used for general-purpose reward learning remains an open question. We introduce Value-Implicit Pre-training (VIP), a self-supervised pre-trained visual representation capable of generating dense and smooth reward functions for unseen robotic tasks. VIP casts representation learning from human videos as an offline goal-conditioned reinforcement learning problem and derives a self-supervised dual goal-conditioned value-function objective…
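
A minimal sketch of how a dense reward can be read off a pre-trained embedding, assuming (as the full paper describes beyond this truncated excerpt) that the goal-conditioned value is the negative embedding distance to a goal image and the reward is the change in that value between consecutive observations. The encoder `toy_encoder`, the 2-D "observations", and all function names are hypothetical stand-ins for illustration; they are not the API of the VIP repository.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Encoder = Callable[[Vector], Vector]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def goal_value(phi: Encoder, obs: Vector, goal: Vector) -> float:
    """Value defined implicitly as negative embedding distance to the goal."""
    return -l2_distance(phi(obs), phi(goal))


def dense_reward(phi: Encoder, obs: Vector, next_obs: Vector, goal: Vector) -> float:
    """Reward as the change in value: positive when next_obs is closer to the goal."""
    return goal_value(phi, next_obs, goal) - goal_value(phi, obs, goal)


def toy_encoder(obs: Vector) -> List[float]:
    """Hypothetical stand-in for a pre-trained visual encoder (identity map)."""
    return list(obs)


# Toy trajectory that moves steadily towards the goal in "observation" space.
trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
goal = (3.0, 0.0)

rewards = [
    dense_reward(toy_encoder, obs, next_obs, goal)
    for obs, next_obs in zip(trajectory, trajectory[1:])
]
print(rewards)
```

In this toy setting every step receives the same positive reward because each step closes the same distance to the goal; with a real encoder the observations would be images and the reward would be only as smooth as the learned embedding.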

Code & Models

Repositories

facebookresearch/vip (PyTorch, official)

Videos

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training · SlidesLive

Taxonomy

Topics: Neuroinflammation and Neurodegeneration Mechanisms · Neural dynamics and brain function · Reinforcement Learning in Robotics