Constrained Reinforcement Learning for Short Video Recommendation

Qingpeng Cai; Ruohan Zhan; Chi Zhang; Jie Zheng; Guangwei Ding; Pinghua Gong; Dong Zheng; Peng Jiang

arXiv:2205.13248·cs.LG·May 27, 2022·5 cites

Open Access

TL;DR

This paper introduces a constrained reinforcement learning framework for short video recommendation that balances optimizing user watch time with auxiliary interactions, demonstrating superior performance in simulations and live platform deployment.

Contribution

It proposes a novel two-stage actor-critic reinforcement learning method for constrained MDPs tailored to short video recommendation, effectively balancing multiple user response objectives.

Findings

01. Outperforms baseline methods in simulations for main and auxiliary objectives.

02. Significantly improves user watch time and interactions in live experiments.

03. Successfully deployed in a production system for real-world short video recommendations.

Abstract

The wide popularity of short videos on social media presents new opportunities and challenges for optimizing recommender systems on video-sharing platforms. Users provide complex and multi-faceted responses to recommendations, including watch time and various types of interactions with videos. As a result, established recommendation algorithms that address a single objective are not adequate to meet this new demand of optimizing comprehensive user experiences. In this paper, we formulate the problem of short video recommendation as a constrained Markov Decision Process (MDP), where platforms want to optimize the main goal of user watch time in the long term, with the constraint of accommodating the auxiliary responses of user interactions such as sharing/downloading videos. To solve the constrained MDP, we propose a two-stage reinforcement learning approach based on actor-critic…
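
For readers unfamiliar with the term, a constrained MDP of the kind the abstract describes is commonly written as below. This is a generic textbook-style sketch, not the paper's own notation: the symbols $r^{(0)}_t$ (main reward, e.g. watch time), $r^{(i)}_t$ (the $i$-th auxiliary reward, e.g. shares or downloads), the thresholds $C_i$, the discount factor $\gamma$, and the number of auxiliary signals $m$ are all assumptions introduced here for illustration.

```latex
% Generic constrained-MDP objective (illustrative notation, assumed here):
% maximize the long-term main reward subject to lower bounds on each
% long-term auxiliary reward.
\begin{aligned}
\max_{\pi} \quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r^{(0)}_{t}\right] \\
\text{s.t.} \quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r^{(i)}_{t}\right] \ge C_{i},
\qquad i = 1, \dots, m .
\end{aligned}
```

Under this reading, watch time plays the role of the objective, while each interaction type contributes one constraint that the learned policy $\pi$ must satisfy.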

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Digital Games and Media