ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

Wanqi Xue; Qingpeng Cai; Ruohan Zhan; Dong Zheng; Peng Jiang; Kun Gai; Bo An

arXiv:2206.02620 · cs.IR · June 19, 2023 · 6 cites

TL;DR

ResAct is a reinforcement learning approach that improves long-term engagement in sequential recommendation by reconstructing online behaviors and using a residual actor, achieving superior results on large-scale datasets.

Contribution

It introduces ResAct, a novel RL method that avoids online exploration by optimizing near the online policy with a residual actor and uses information-theoretical regularizers for feature extraction.

Findings

1. ResAct significantly outperforms state-of-the-art baselines.

2. It effectively estimates long-term engagement without online exploration.

3. The method demonstrates strong performance on large-scale industrial data.

Abstract

Long-term engagement is preferred over immediate engagement in sequential recommendation as it directly affects product operational metrics such as daily active users (DAUs) and dwell time. Meanwhile, reinforcement learning (RL) is widely regarded as a promising framework for optimizing long-term engagement in sequential recommendation. However, due to expensive online interactions, it is very difficult for RL algorithms to perform state-action value estimation, exploration and feature extraction when optimizing long-term engagement. In this paper, we propose ResAct, which seeks a policy that is close to, but better than, the online-serving policy. In this way, we can collect sufficient data near the learned policy so that state-action values can be properly estimated, and there is no need to perform online exploration. ResAct optimizes the policy by first reconstructing the online…
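
The idea of a policy that stays "close to, but better than" the online-serving policy can be illustrated with a small sketch. This is not the authors' implementation: the function name `residual_action`, the `tanh` squashing, and the bound `delta` are assumptions made here for illustration, and the reconstructed online action and the raw residual are random stand-ins for what learned networks would produce.

```python
import math
import random


def residual_action(online_action, raw_residual, delta=0.1):
    """Add a bounded residual to a (reconstructed) online action.

    Assumption: the residual is squashed with tanh and scaled by `delta`,
    so each coordinate of the result stays within `delta` of the online action.
    """
    return [a + delta * math.tanh(r) for a, r in zip(online_action, raw_residual)]


random.seed(0)

# Stand-in for an action reconstructed from the online-serving policy.
online_action = [random.uniform(-1.0, 1.0) for _ in range(4)]

# Stand-in for the unbounded output of a residual actor network.
raw_residual = [random.gauss(0.0, 5.0) for _ in range(4)]

improved = residual_action(online_action, raw_residual, delta=0.1)

# Largest per-coordinate deviation from the online action.
max_shift = max(abs(b - a) for a, b in zip(online_action, improved))
print(max_shift <= 0.1 + 1e-9)
```

Because the residual is bounded, the final action never leaves a small neighbourhood of the online action, which is the property the abstract relies on when it says that sufficient data can be collected near the learned policy.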

Code & Models

Repositories

chongminggao/easyrl4rec
pytorch

Videos

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor · slideslive

Taxonomy

Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Smart Grid Energy Management