ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems

Yi Zhang; Ruihong Qiu; Jiajun Liu; Sen Wang

arXiv:2407.13163 · cs.IR · May 13, 2025

TL;DR

ROLeR introduces a novel reward shaping method and improved uncertainty estimation for offline reinforcement learning in recommender systems, leading to state-of-the-art performance on benchmark datasets.

Contribution

The paper proposes ROLeR, a new approach that enhances reward modeling and uncertainty estimation in model-based offline RL for recommender systems.

Findings

1. ROLeR outperforms existing baselines on four benchmark datasets.
2. The non-parametric reward shaping improves reward model accuracy.
3. Enhanced uncertainty penalties lead to better recommendation performance.
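
The two ideas named in the findings, a non-parametric reward estimate and an uncertainty penalty, can be illustrated with a small sketch. This is not the paper's implementation: it assumes a plain k-nearest-neighbour average over logged rewards and uses the mean neighbour distance as a stand-in for uncertainty. The function names, the state embeddings, `k`, and the penalty weight `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_shaped_reward(query, logged_states, logged_rewards, k=3):
    """Estimate a reward for `query` from its k nearest logged states.

    Returns (reward_estimate, uncertainty). Both definitions are
    illustrative assumptions, not the method from the paper.
    """
    # Euclidean distance from the query to every logged state embedding.
    dists = np.linalg.norm(logged_states - query, axis=1)
    # Indices of the k closest logged states.
    nearest = np.argsort(dists)[:k]
    # Non-parametric estimate: average the rewards observed at the neighbours.
    reward = float(logged_rewards[nearest].mean())
    # Assumed uncertainty proxy: queries far from the logged data are less trusted.
    uncertainty = float(dists[nearest].mean())
    return reward, uncertainty

def penalized_reward(reward, uncertainty, lam=0.5):
    """Subtract an uncertainty penalty, as in pessimistic offline RL."""
    return reward - lam * uncertainty

# Toy logged data: four 2-D state embeddings with observed rewards.
logged_states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
logged_rewards = np.array([1.0, 2.0, 3.0, 10.0])

r_near, u_near = knn_shaped_reward(np.array([0.0, 0.0]), logged_states, logged_rewards)
r_far, u_far = knn_shaped_reward(np.array([9.0, 9.0]), logged_states, logged_rewards)
```

In this toy setup the query far from the logged data receives a larger uncertainty, so its reward is penalized more heavily before a policy would see it.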

Abstract

Offline reinforcement learning (RL) is an effective tool for real-world recommender systems with its capacity to model the dynamic interest of users and its interactive nature. Most existing offline RL recommender systems focus on model-based RL through learning a world model from offline data and building the recommendation policy by interacting with this model. Although these methods have made progress in the recommendation performance, the effectiveness of model-based offline RL methods is often constrained by the accuracy of the estimation of the reward model and the model uncertainties, primarily due to the extreme discrepancy between offline logged data and real-world data in user interactions with online platforms. To fill this gap, a more accurate reward model and uncertainty estimation are needed for the model-based RL methods. In this paper, a novel model-based Reward Shaping…

Code & Models

Repositories

ArronDZhang/ROLeR (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Focus