On-Robot Reinforcement Learning with Goal-Contrastive Rewards

Ondrej Biza; Thomas Weng; Lingfeng Sun; Karl Schmeckpeper; Tarik Kelestemur; Yecheng Jason Ma; Robert Platt; Jan-Willem van de Meent; Lawson L. S. Wong

arXiv:2410.19989 · cs.RO · May 16, 2025

TL;DR

This paper introduces GCR, a method for learning dense reward functions from passive videos. It improves sample efficiency in robot reinforcement learning, enables cross-embodiment transfer, and reduces the need for manual reward engineering.

Contribution

GCR is a new approach to learning dense rewards from passive videos; it makes RL more sample-efficient and supports transfer across different embodiments.

Findings

1. GCR doubles the number of solvable tasks compared to baseline methods.

2. GCR enables effective cross-embodiment transfer from human and robot videos.

3. GCR improves sample efficiency in simulated and real robot environments.

Abstract

Reinforcement Learning (RL) has the potential to enable robots to learn from their own actions in the real world. Unfortunately, RL can be prohibitively expensive in terms of on-robot runtime, because exploration is inefficient when learning from a sparse reward signal. Designing dense reward functions is labour-intensive and requires domain expertise. In our work, we propose GCR (Goal-Contrastive Rewards), a dense reward function learning method that can be trained on passive video demonstrations. By using videos without actions, our method is easier to scale, as we can use arbitrary videos. GCR combines two loss functions: an implicit value loss that models how the reward increases when traversing a successful trajectory, and a goal-contrastive loss that discriminates between successful and failed trajectories. We perform experiments in simulated manipulation environments across…
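The two-term objective described in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out: the hinge-style `progress_loss` is a simplified stand-in for the implicit value loss, the pairwise logistic `goal_contrastive_loss` is one plausible way to separate successful from failed trajectories, and all function names, the `margin`, and the `weight` are hypothetical. Trajectories are represented here as plain lists of already-predicted scalar rewards, one per video frame.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def progress_loss(rewards, margin=0.1):
    """Assumed stand-in for the implicit value loss.

    Penalizes every pair of consecutive frames in a successful
    trajectory whose predicted reward fails to rise by `margin`.
    """
    steps = list(zip(rewards[:-1], rewards[1:]))
    return sum(max(0.0, margin - (nxt - cur)) for cur, nxt in steps) / len(steps)


def goal_contrastive_loss(success_final, failure_final):
    """Assumed pairwise logistic loss.

    Pushes the reward of the last frame of each successful trajectory
    above the reward of the last frame of each failed trajectory.
    """
    pairs = [(s, f) for s in success_final for f in failure_final]
    return sum(softplus(f - s) for s, f in pairs) / len(pairs)


def combined_loss(success_trajs, failed_trajs, weight=1.0, margin=0.1):
    """Sum of the two terms; `weight` (hypothetical) balances them."""
    value_term = sum(progress_loss(t, margin) for t in success_trajs) / len(success_trajs)
    contrast_term = goal_contrastive_loss(
        [t[-1] for t in success_trajs],
        [t[-1] for t in failed_trajs],
    )
    return value_term + weight * contrast_term


if __name__ == "__main__":
    # Rewards that rise on the success and end low on the failure.
    well_shaped = combined_loss([[0.0, 0.5, 1.0]], [[0.0, 0.1, -1.0]])
    # Rewards that fall on the success and end high on the failure.
    badly_shaped = combined_loss([[1.0, 0.5, 0.0]], [[0.0, 0.1, 1.0]])
    print(well_shaped < badly_shaped)
```

In a real pipeline the scalar rewards would come from a learned model applied to video frames, and the loss would be minimized with respect to that model's parameters; the sketch only shows how the two terms reward different properties of the predicted signal.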

Taxonomy

Topics: Reinforcement Learning in Robotics