SR-Reward: Taking The Path More Traveled

Seyed Mahdi B. Azad; Zahra Padar; Gabriel Kalweit; Joschka Boedecker

arXiv:2501.02330·cs.LG·June 13, 2025

Open Access

TL;DR

This paper introduces SR-Reward, a novel method for learning reward functions from offline demonstrations using successor representations, which improves stability and robustness in offline reinforcement learning.

Contribution

The paper presents SR-Reward, a successor representation-based reward learning method that decouples reward from policy, enabling stable offline RL without adversarial training.

Findings

01

Achieves competitive results on the D4RL benchmark.

02

Enhances robustness with a negative sampling strategy.

03

Reveals advantages and limitations through ablation studies.

Abstract

In this paper, we propose a novel method for learning reward functions directly from offline demonstrations. Unlike traditional inverse reinforcement learning (IRL), our approach decouples the reward function from the learner's policy, eliminating the adversarial interaction typically required between the two. This results in a more stable and efficient training process. Our reward function, called SR-Reward, leverages successor representation (SR) to encode a state based on expected future states' visitation under the demonstration policy and transition dynamics. By utilizing the Bellman equation, SR-Reward can be learned concurrently with most reinforcement learning (RL) algorithms without altering the existing training pipeline. We also introduce a negative sampling strategy to mitigate overestimation errors by reducing rewards for out-of-distribution data, thereby enhancing…
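To make the successor representation mentioned in the abstract concrete, here is a minimal tabular sketch of learning an SR matrix with a Bellman-style temporal-difference update from demonstration transitions. It is an illustration of the general SR idea only, not the paper's implementation: the three-state toy chain, the function name `td_successor_representation`, and the hyperparameters are all assumptions made for this example, and the sketch does not compute the SR-Reward itself or the negative sampling step.

```python
import numpy as np

def td_successor_representation(transitions, n_states, gamma=0.9, alpha=0.5, sweeps=500):
    """Estimate a tabular SR matrix M from (state, next_state) pairs.

    M[s, j] approximates the expected discounted number of visits to
    state j when starting in s and following the demonstration data.
    Hypothetical helper for illustration; not taken from the paper.
    """
    M = np.zeros((n_states, n_states))
    one_hot = np.eye(n_states)
    for _ in range(sweeps):
        for s, s_next in transitions:
            # Bellman target for the SR: indicator of the current state
            # plus the discounted SR of the next state.
            target = one_hot[s] + gamma * M[s_next]
            M[s] += alpha * (target - M[s])
    return M

# Toy demonstration data (assumed): a chain 0 -> 1 -> 2, with state 2 absorbing.
demo_transitions = [(0, 1), (1, 2), (2, 2)]
gamma = 0.9
M_td = td_successor_representation(demo_transitions, n_states=3, gamma=gamma)

# Closed-form SR for the same deterministic dynamics: (I - gamma * P)^-1.
P = np.zeros((3, 3))
for s, s_next in demo_transitions:
    P[s, s_next] = 1.0
M_exact = np.linalg.inv(np.eye(3) - gamma * P)

print(np.round(M_td, 3))
```

In this toy setting the TD estimate agrees with the closed-form matrix, which shows why a Bellman update suffices to learn the SR alongside a standard RL training loop. With function approximation over continuous states, as in the D4RL setting, the table would be replaced by a learned network.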

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning