Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems

Qihua Zhang; Junning Liu; Yuzhuo Dai; Yiyan Qi; Yifan Yuan; Kunlun Zheng; Fan Huang; Xianfeng Tan

arXiv:2208.04560·cs.IR·August 11, 2022


TL;DR

This paper introduces a reinforcement learning-based framework for multi-task fusion in recommender systems, optimizing long-term user satisfaction through offline and online learning, and demonstrates its effectiveness on large-scale real-world data.

Contribution

It proposes a novel Batch RL approach for multi-task fusion in recommender systems, addressing long-term satisfaction and online exploration, with successful large-scale deployment.

Findings

01. Effective long-term user satisfaction optimization

02. Successful deployment on a large-scale industrial platform

03. Outperforms baseline models in real-world experiments

Abstract

Recommender Systems (RS) are an important online application that affects billions of users every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task Learning (MTL) model that predicts various kinds of user feedback, e.g., clicks, likes, and shares, and a Multi-Task Fusion (MTF) model that combines the multi-task outputs into one final ranking score with respect to user satisfaction. The fusion model has received little research attention even though, as the last crucial step of ranking, it has a great impact on the final recommendation. To optimize long-term user satisfaction rather than greedily obtain instant returns, we formulate the MTF task as a Markov Decision Process (MDP) within a recommendation session and propose a Batch Reinforcement Learning (RL) based Multi-Task Fusion framework (BatchRL-MTF) that includes a Batch RL framework and an online exploration. The former…
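To make the fusion step concrete, here is a minimal sketch of how multi-task outputs might be combined into a single ranking score. The abstract does not give the fusion formula, so the weighted sum, the function names (`fuse_scores`, `rank_items`), the task names, and all numbers below are illustrative assumptions rather than the paper's method. In an RL formulation like the one described, the weights could plausibly be the action a policy picks per session state; that mapping is also an assumption here.

```python
from typing import Dict, List, Tuple

# Hypothetical fusion function: a plain weighted sum of per-task predictions.
# The paper's actual fusion formula is not stated in the abstract.
def fuse_scores(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-task predicted feedback probabilities into one score."""
    return sum(weights[task] * predictions[task] for task in weights)


def rank_items(
    candidates: List[Tuple[str, Dict[str, float]]],
    weights: Dict[str, float],
) -> List[str]:
    """Return item ids sorted by fused score, highest first."""
    scored = [(fuse_scores(preds, weights), item_id) for item_id, preds in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored]


# Made-up MTL outputs for two candidate items (illustrative only).
candidates = [
    ("video_a", {"click": 0.9, "like": 0.1, "share": 0.0}),
    ("video_b", {"click": 0.5, "like": 0.5, "share": 0.2}),
]

# Two hypothetical weight settings: one favours clicks only,
# the other also rewards likes and shares.
click_only = {"click": 1.0, "like": 0.0, "share": 0.0}
engagement = {"click": 1.0, "like": 2.0, "share": 3.0}

ranking_click_only = rank_items(candidates, click_only)
ranking_engagement = rank_items(candidates, engagement)
```

With these made-up numbers the two weight settings order the items differently, which is why the choice of fusion weights matters for the final recommendation.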
