Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang, Davin Tjia, Jacob Berg, Dima Damen, Pulkit Agrawal, Abhishek Gupta

TL;DR
Rank2Reward is a novel method that learns reward functions from passive videos to guide robot reinforcement learning without requiring low-level state or action data, enabling scalable skill acquisition from visual demonstrations.
Contribution
The paper introduces Rank2Reward, a technique that infers reward functions by ranking video frames, so that behaviors can be learned from raw videos without low-level state or action data and without explicit reward engineering.
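To make the idea of "ranking video frames" concrete, here is a minimal sketch of one way a temporal ranking objective could be set up. It is not the authors' implementation: the pairwise logistic (Bradley-Terry style) loss, the linear utility, the toy feature vectors, and all function names (`make_toy_video`, `fit_ranking_utility`, `ranking_accuracy`) are assumptions chosen for illustration. The sketch trains a utility so that later frames of a demonstration score higher than earlier ones, and the resulting score can be read as a rough progress signal.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_toy_video(num_frames=50, dim=4, seed=0):
    # Hypothetical stand-in for per-frame features of one demonstration:
    # feature 0 grows with task progress, the rest are small noise.
    rng = np.random.default_rng(seed)
    frames = rng.normal(0.0, 0.02, size=(num_frames, dim))
    frames[:, 0] += np.linspace(0.0, 1.0, num_frames)
    return frames

def sample_pairs(num_frames, num_pairs, rng):
    # Draw random frame index pairs and order each as (earlier, later).
    i = rng.integers(0, num_frames, size=num_pairs)
    j = rng.integers(0, num_frames, size=num_pairs)
    keep = i != j
    i, j = i[keep], j[keep]
    return np.minimum(i, j), np.maximum(i, j)

def fit_ranking_utility(frames, steps=300, lr=0.5, num_pairs=256, seed=0):
    # Assumed objective: minimize -log sigmoid(u(later) - u(earlier))
    # with a linear utility u(x) = w . x, using plain gradient descent.
    rng = np.random.default_rng(seed)
    w = np.zeros(frames.shape[1])
    for _ in range(steps):
        earlier, later = sample_pairs(len(frames), num_pairs, rng)
        diff = frames[later] - frames[earlier]
        p = sigmoid(diff @ w)  # modeled probability that "later" outranks "earlier"
        grad = -((1.0 - p)[:, None] * diff).mean(axis=0)
        w -= lr * grad
    return w

def ranking_accuracy(frames, w):
    # Fraction of all (earlier, later) pairs that the utility orders correctly.
    u = frames @ w
    i, j = np.triu_indices(len(u), k=1)
    return float(np.mean(u[j] > u[i]))

frames = make_toy_video()
w = fit_ranking_utility(frames)
progress_score = frames @ w  # higher for frames later in the demonstration
print("pairwise ranking accuracy:", ranking_accuracy(frames, w))
```

In this toy setting the learned score rises along the demonstration, which is the property a shaped reward needs. A real system would replace the hand-made features with a learned image encoder and would need additional machinery to handle states the demonstrations never visit.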
Findings
Successfully learned behaviors from raw videos in simulation and real-world tasks.
Effectively extended to large-scale web video datasets.
Guided reinforcement learning without the policy exploiting the learned reward function.
Abstract
Teaching robots novel skills with demonstrations via human-in-the-loop data collection techniques like kinesthetic teaching or teleoperation puts a heavy burden on human supervisors. In contrast to this paradigm, it is often significantly easier to provide raw, action-free visual data of tasks being performed. Moreover, this data can even be mined from video datasets or the web. Ideally, this data can serve to guide robot learning for new tasks in novel environments, informing both "what" to do and "how" to do it. A powerful way to encode both the "what" and the "how" is to infer a well-shaped reward function for reinforcement learning. The challenge is determining how to ground visual demonstration inputs into a well-shaped and informative reward function. We propose a technique, Rank2Reward, for learning behaviors from videos of tasks being performed without access to any low-level…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Music and Audio Processing · Video Analysis and Summarization
