Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective

Xin Xin; Tiago Pimentel; Alexandros Karatzoglou; Pengjie Ren; Konstantina Christakopoulou; Zhaochun Ren

arXiv:2206.07353 · cs.IR · June 16, 2022 · 5 cites

TL;DR

This paper introduces Prompt-Based Reinforcement Learning (PRL), a novel offline training paradigm for recommendation systems that predicts items based on historical data and reward prompts, avoiding online exploration errors.

Contribution

It proposes a new offline RL training approach for recommendation agents that directly infers actions from state-reward prompts, simplifying training with supervised learning.
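As a rough illustration of what a state-reward prompt could look like, the sketch below turns one logged session into supervised (prompt, target item) pairs. It is not the authors' implementation. The names `Session`, `Prompt`, and `build_prompt_examples` are hypothetical, and the choice of the reward summed over the rest of the session as the reward part of the prompt is an assumption made here for illustration.

```python
from typing import List, Tuple

# Hypothetical types (not from the paper): a session is a list of
# (item_id, reward) interactions taken from historical logs.
Session = List[Tuple[int, float]]
# A prompt pairs the state (items seen so far) with a reward signal.
Prompt = Tuple[Tuple[int, ...], float]


def build_prompt_examples(session: Session) -> List[Tuple[Prompt, int]]:
    """Turn one logged session into (prompt, target_item) supervised pairs."""
    n = len(session)
    # Assumption: the reward signal is the reward-to-go, i.e. the sum of
    # rewards from step t to the end of the session (a suffix sum).
    reward_to_go = [0.0] * (n + 1)
    for t in range(n - 1, -1, -1):
        reward_to_go[t] = reward_to_go[t + 1] + session[t][1]

    examples = []
    # Start at t = 1 so that every state holds at least one past item.
    for t in range(1, n):
        state = tuple(item for item, _ in session[:t])
        target_item = session[t][0]
        examples.append(((state, reward_to_go[t]), target_item))
    return examples


if __name__ == "__main__":
    # Toy session: two low-reward interactions, then a high-reward one.
    toy_session = [(10, 1.0), (11, 1.0), (12, 5.0)]
    for prompt, target in build_prompt_examples(toy_session):
        print(prompt, "->", target)
```

Pairs of this kind could then be fed to any standard next-item classifier trained with a supervised loss. At inference time, the reward part of the prompt would be set to the outcome one wants the recommender to aim for.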

Findings

01. PRL outperforms traditional RL methods on real-world datasets.

02. The approach simplifies offline training by using supervised learning.

03. Experimental results demonstrate superior recommendation accuracy.

Abstract

Modern recommender systems aim to improve user experience. As reinforcement learning (RL) naturally fits this objective -- maximizing a user's reward per session -- it has become an emerging topic in recommender systems. Developing RL-based recommendation methods, however, is not trivial due to the offline training challenge. Specifically, the keystone of traditional RL is to train an agent with large amounts of online exploration, making lots of 'errors' in the process. In the recommendation setting, though, we cannot afford the price of making 'errors' online. As a result, the agent needs to be trained through offline historical implicit feedback, collected under different recommendation policies; traditional RL algorithms may lead to sub-optimal policies under these offline training settings. Here we propose a new learning paradigm -- namely Prompt-Based Reinforcement…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Smart Grid Energy Management