Integrating Offline Reinforcement Learning with Transformers for Sequential Recommendation

Xumei Xi; Yuke Zhao; Quan Liu; Liwen Ouyang; Yang Wu

arXiv:2307.14450·cs.IR·July 28, 2023

TL;DR

This paper trains a sequential recommender with a fully offline reinforcement learning algorithm whose policy network is initialized from a pre-trained transformer, and reports higher-quality recommendations than supervised baselines on e-commerce and movie datasets.

Contribution

It presents a fully offline RL framework using transformer-based models for sequential recommendation, which converges quickly and outperforms existing supervised methods.

Findings

01. Robust performance across e-commerce and movie datasets

02. Faster convergence compared to online RL methods

03. Higher recommendation quality than state-of-the-art supervised algorithms

Abstract

We consider the problem of sequential recommendation, where the current recommendation is made based on past interactions. This recommendation task requires efficient processing of the sequential data and aims to provide recommendations that maximize the long-term reward. To this end, we train a farsighted recommender by using an offline RL algorithm with the policy network in our model architecture that has been initialized from a pre-trained transformer model. The pre-trained model leverages the superb ability of the transformer to process sequential information. Compared to prior works that rely on online interaction via simulation, we focus on implementing a fully offline RL framework that is able to converge in a fast and stable way. Through extensive experiments on public datasets, we show that our method is robust across various recommendation regimes, including e-commerce and…
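
The "long-term reward" that the abstract refers to is commonly formalized as a discounted sum of future rewards. The sketch below is only an illustration of that quantity, not the paper's algorithm: the function name `discounted_returns`, the discount factor `gamma`, and the reward values assigned to a logged session are all assumptions made for the example.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.9) -> List[float]:
    """Compute the return-to-go G_t = r_t + gamma * G_{t+1} for one logged session.

    `rewards` holds the per-step rewards of a single interaction sequence
    taken from an offline log; `gamma` is an assumed discount factor.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical session: a click (0.2), no interaction (0.0), then a purchase (1.0).
session_rewards = [0.2, 0.0, 1.0]
session_returns = discounted_returns(session_rewards, gamma=0.5)
```

In this toy session the first step earns a small immediate reward, but its return also counts part of the later purchase. A recommender trained to maximize returns rather than immediate rewards is "farsighted" in that sense.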

Taxonomy

Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Machine Learning in Healthcare
