Rank2Reward: Learning Shaped Reward Functions from Passive Video

Daniel Yang; Davin Tjia; Jacob Berg; Dima Damen; Pulkit Agrawal; Abhishek Gupta

arXiv:2404.14735 · cs.RO · April 24, 2024

TL;DR

Rank2Reward learns reward functions from passive videos to guide robot reinforcement learning, without requiring low-level state or action data, so that skills can be acquired from visual demonstrations alone.

Contribution

The paper introduces Rank2Reward, a technique that infers reward functions by ranking video frames, enabling learning from raw videos without low-level state or action data and without hand-engineered rewards.
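The idea of inferring a reward by ranking frames can be illustrated with a minimal sketch. It assumes, beyond what this summary states, that frames are ranked by their temporal order within a demonstration (later frames should score higher) and that a pairwise logistic, Bradley-Terry style loss is used. The linear scorer, the toy 2-D frame features, and all function names are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def utility(w, frame):
    """Linear stand-in for a learned frame-scoring network (hypothetical)."""
    return sum(wi * xi for wi, xi in zip(w, frame))


def train_ranker(videos, dim, steps=2000, lr=0.1, seed=0):
    """Fit w so that later frames in a video score higher than earlier ones.

    Pairwise logistic loss: loss = -log sigmoid(u(later) - u(earlier)).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        video = rng.choice(videos)
        # Pick two distinct frame indices; i is the earlier one, j the later.
        i, j = sorted(rng.sample(range(len(video)), 2))
        earlier, later = video[i], video[j]
        p = sigmoid(utility(w, later) - utility(w, earlier))
        # Gradient step on -log(p): move w along (1 - p) * (later - earlier).
        for k in range(dim):
            w[k] += lr * (1.0 - p) * (later[k] - earlier[k])
    return w


def make_video(length, rng):
    """Toy video: feature 0 grows with task progress, feature 1 is noise."""
    return [[t / (length - 1), rng.uniform(-1, 1)] for t in range(length)]


rng = random.Random(1)
videos = [make_video(10, rng) for _ in range(5)]
w = train_ranker(videos, dim=2)

# The learned score can then serve as a shaped reward signal for RL:
# frames that look further along in the task receive a higher value.
start_frame, goal_frame = [0.0, 0.0], [1.0, 0.0]
reward_start = utility(w, start_frame)
reward_goal = utility(w, goal_frame)
```

In this toy setup the scorer only needs to discover that the first feature tracks progress; with real video, the linear scorer would be replaced by a learned image encoder.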

Findings

01 Learned behaviors from raw videos in both simulated and real-world tasks.

02 Extended to large-scale web video datasets.

03 Guided reinforcement learning while avoiding exploitation of the learned reward function.

Abstract

Teaching robots novel skills with demonstrations via human-in-the-loop data collection techniques like kinesthetic teaching or teleoperation puts a heavy burden on human supervisors. In contrast to this paradigm, it is often significantly easier to provide raw, action-free visual data of tasks being performed. Moreover, this data can even be mined from video datasets or the web. Ideally, this data can serve to guide robot learning for new tasks in novel environments, informing both "what" to do and "how" to do it. A powerful way to encode both the "what" and the "how" is to infer a well-shaped reward function for reinforcement learning. The challenge is determining how to ground visual demonstration inputs into a well-shaped and informative reward function. We propose a technique Rank2Reward for learning behaviors from videos of tasks being performed without access to any low-level…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Music and Audio Processing · Video Analysis and Summarization