Residual Reward Models for Preference-based Reinforcement Learning
Chenyang Cao, Miguel Rogel-García, Mohamed Nabail, Xueqian Wang, Nicholas Rhinehart

TL;DR
This paper introduces Residual Reward Models (RRM) for preference-based reinforcement learning, which combine prior knowledge with learned rewards to improve convergence speed and performance in complex environments, including real robots.
Contribution
The paper proposes a novel Residual Reward Model framework that effectively integrates prior reward knowledge with learned preferences, enhancing RL performance and convergence speed.
Findings
RRM significantly improves RL performance across multiple tasks.
The method accelerates policy learning in real robot experiments.
RRM is effective with various types of prior rewards, including IRL and proxy rewards.
Abstract
Preference-based Reinforcement Learning (PbRL) provides a way to learn high-performance policies in environments where the reward signal is hard to specify, avoiding heuristic and time-consuming reward design. However, PbRL can suffer from slow convergence since it requires training a reward model. Prior work has proposed learning a reward model from demonstrations and fine-tuning it using preferences. However, when the model is a neural network, using different loss functions for pre-training and fine-tuning can pose challenges to reliable optimization. In this paper, we propose a method to effectively leverage prior knowledge with a Residual Reward Model (RRM). An RRM assumes that the true reward of the environment can be split into a sum of two parts: a prior reward and a learned reward. The prior reward is a term available before training, for example, a user's “best…
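The decomposition described in the abstract (true reward = prior reward + learned reward) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the prior is a hypothetical negative-distance heuristic, the learned residual is a linear function of the observation rather than a neural network, and preferences are fit with a Bradley-Terry style cross-entropy loss, which is common in PbRL but is not confirmed by the text above.

```python
import numpy as np


def prior_reward(obs):
    """Hypothetical fixed prior: negative distance to the origin.

    Stands in for whatever reward is available before training.
    """
    return -np.linalg.norm(obs, axis=-1)


class ResidualRewardModel:
    """Total reward = fixed prior + learned residual (linear, for illustration)."""

    def __init__(self, obs_dim, lr=0.1):
        # Zero initialisation means the model starts out equal to the prior.
        self.w = np.zeros(obs_dim)
        self.lr = lr

    def residual(self, obs):
        return obs @ self.w

    def reward(self, obs):
        # Only the residual term has trainable parameters.
        return prior_reward(obs) + self.residual(obs)

    def segment_return(self, segment):
        # segment has shape (T, obs_dim); sum per-step rewards.
        return self.reward(segment).sum()

    def update(self, seg_a, seg_b, label):
        """One gradient step on a single preference.

        label is 1.0 if seg_a is preferred and 0.0 if seg_b is preferred.
        Returns the cross-entropy loss evaluated before the step.
        """
        logit = self.segment_return(seg_a) - self.segment_return(seg_b)
        p_a = 1.0 / (1.0 + np.exp(-logit))
        loss = -(label * np.log(p_a + 1e-12)
                 + (1.0 - label) * np.log(1.0 - p_a + 1e-12))
        # d(loss)/d(logit) = p_a - label; the prior does not depend on w,
        # so d(logit)/dw is the difference of summed observations.
        grad = (p_a - label) * (seg_a.sum(axis=0) - seg_b.sum(axis=0))
        self.w -= self.lr * grad
        return loss
```

In this sketch the prior cannot distinguish two segments that are equally far from the origin, so any preference between them has to be explained by the residual term. Repeated calls to `update` with a consistent label shift the residual weights while the prior itself stays fixed.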
