RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System
Kai Wang, Zhene Zou, Minghao Zhao, Qilin Deng, Yue Shang, Yile Liang, Runze Wu, Xudong Shen, Tangjie Lyu, Changjie Fan

TL;DR
This paper introduces RL4RS, the first open-source real-world dataset for reinforcement learning-based recommender systems, along with a systematic evaluation framework intended to narrow the reality gap in RL research for recommendation tasks.
Contribution
The paper provides the RL4RS dataset, tools, and evaluation framework to bridge the gap between academic RL research and real-world recommender system deployment.
Findings
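The RL4RS dataset supports RL research on recommender systems using real-world rather than artificial or semi-simulated data.
The evaluation framework is designed to improve how RL-based recommender systems are validated before deployment.
Baseline algorithms demonstrate the dataset's utility.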
Abstract
Reinforcement learning based recommender systems (RL-based RS) aim at learning a good policy from a batch of collected data, by casting recommendations to multi-step decision-making tasks. However, current RL-based RS research commonly has a large reality gap. In this paper, we introduce the first open-source real-world dataset, RL4RS, hoping to replace the artificial datasets and semi-simulated RS datasets previous studies used due to the resource limitation of the RL-based RS domain. Unlike academic RL research, RL-based RS suffers from the difficulties of being well-validated before deployment. We attempt to propose a new systematic evaluation framework, including evaluation of environment simulation, evaluation on environments, counterfactual policy evaluation, and evaluation on environments built from test set. In summary, the RL4RS (Reinforcement Learning for Recommender Systems),…
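The counterfactual policy evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which estimator the authors use, so the code below assumes ordinary trajectory-wise importance sampling, one standard choice. The data layout, the function `importance_sampling_value`, and the toy log are all hypothetical and do not reflect the RL4RS schema or tooling.

```python
from typing import Callable, Sequence, Tuple

# One logged step (hypothetical layout, not the RL4RS schema):
# (state, action, reward, probability the logging policy assigned to that action)
Step = Tuple[int, int, float, float]


def importance_sampling_value(
    episodes: Sequence[Sequence[Step]],
    target_prob: Callable[[int, int], float],
    gamma: float = 1.0,
) -> float:
    """Ordinary importance sampling estimate of a target policy's expected return.

    Each logged episode's discounted return is reweighted by the product of
    per-step probability ratios between the target and logging policies.
    """
    total = 0.0
    for episode in episodes:
        weight = 1.0    # cumulative importance ratio for this episode
        ret = 0.0       # discounted return observed in the log
        discount = 1.0
        for state, action, reward, behavior_prob in episode:
            weight *= target_prob(state, action) / behavior_prob
            ret += discount * reward
            discount *= gamma
        total += weight * ret
    return total / len(episodes)


# Toy log: two 2-step sessions over two items, collected by a uniform logging policy.
logged = [
    [(0, 0, 1.0, 0.5), (1, 1, 0.0, 0.5)],
    [(0, 1, 0.0, 0.5), (1, 0, 1.0, 0.5)],
]


def prefers_item_zero(state: int, action: int) -> float:
    """Hypothetical target policy: recommends item 0 with probability 0.75."""
    return 0.75 if action == 0 else 0.25


def uniform(state: int, action: int) -> float:
    """Target policy identical to the logging policy."""
    return 0.5


estimate = importance_sampling_value(logged, prefers_item_zero)
on_policy = importance_sampling_value(logged, uniform)
```

When the target policy equals the logging policy, every weight is 1 and the estimate reduces to the average logged return, which is a quick sanity check. On longer sessions the product of ratios can have high variance, which is one reason offline estimates are usually read alongside other forms of evaluation.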
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Test
