UnifiedRL: A Reinforcement Learning Algorithm Tailored for Multi-Task Fusion in Large-Scale Recommender Systems
Peng Liu, Cong Xu, Ming Zhao, Jiawei Zhu, Bin Wang, Yi Ren

TL;DR
UnifiedRL is a reinforcement learning algorithm designed specifically for multi-task fusion in large-scale recommender systems. It addresses key limitations of existing offline RL methods and, in real-world deployment, increased user valid consumption by 4.64% and user duration time by 1.74%.
Contribution
UnifiedRL introduces a seamless integration of offline RL with a custom exploration policy, enabling efficient online exploration and superior multi-task fusion in recommender systems.
Findings
Increased user valid consumption by 4.64%
Increased user duration time by 1.74%
Deployed in multiple large-scale recommender systems since June 2023
Abstract
As the last pivotal stage of a Recommender System (RS), Multi-Task Fusion (MTF) is responsible for combining the multiple scores output by the Multi-Task Learning (MTL) model into a final score that maximizes user satisfaction. Recently, Reinforcement Learning (RL) has been used for MTF in RSs to optimize long-term user satisfaction. However, the existing offline RL algorithms used for MTF have the following severe problems: a) to avoid Out-of-Distribution (OOD) problems, their constraints are overly strict, which seriously damages performance; b) they are unaware of the exploration policy used to collect the training data, so only a suboptimal policy can be learned; c) their exploration policies are inefficient and hurt user experience. To solve these problems, we propose an innovative method called UnifiedRL, tailored for MTF in large-scale RSs. UnifiedRL seamlessly integrates an offline RL model with its custom…
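To make the fusion step concrete, here is a minimal sketch of how a set of fusion weights could turn per-task scores into one ranking score per item. The weighted-product formula, the function names `fuse_scores` and `rank_items`, and the example numbers are all illustrative assumptions; the abstract does not state which fusion formula UnifiedRL uses. In an RL setting, the weight vector would typically play the role of the action chosen by the policy.

```python
import numpy as np


def fuse_scores(scores, weights):
    """Combine per-task scores into one final score per item.

    Assumed (hypothetical) fusion formula: final_i = prod_k scores[i, k] ** weights[k].
    scores:  array of shape (n_items, n_tasks), values in (0, 1]
    weights: array of shape (n_tasks,), e.g. the action output by an RL policy
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Work in log space for numerical stability; clip to avoid log(0).
    log_scores = np.log(np.clip(scores, 1e-9, None))
    return np.exp(log_scores @ weights)


def rank_items(scores, weights):
    """Return item indices sorted by fused score, highest first."""
    fused = fuse_scores(scores, weights)
    return np.argsort(-fused, kind="stable")


if __name__ == "__main__":
    # Two items, two hypothetical task scores (e.g. click and watch-time predictions).
    mtl_scores = [[0.5, 0.5], [0.25, 1.0]]
    print(rank_items(mtl_scores, [2.0, 0.0]))  # only the first task matters
    print(rank_items(mtl_scores, [0.0, 1.0]))  # only the second task matters
```

Changing the weights changes which item ranks first, which is why the choice of fusion weights can be treated as the decision an RL policy optimizes.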
Taxonomy
Topics: Recommender Systems and Techniques · Reinforcement Learning in Robotics
