Supervised Learning-enhanced Multi-Group Actor Critic for Live Stream Allocation in Feed

Jingxin Liu; Xiang Gao; Yisha Li; Xin Li; Haiyang Lu; Ben Wang

arXiv:2412.10381·cs.IR·May 27, 2025

TL;DR

This paper introduces a novel reinforcement learning algorithm, SL-MGAC, that improves live stream recommendation by enhancing stability and reducing variance, leading to better long-term user engagement in feed systems.

Contribution

The paper proposes a supervised learning-enhanced multi-group actor-critic algorithm with variance reduction and a new reward function for stable, effective live stream allocation in recommendation systems.

Findings

01. Outperforms baseline methods in offline evaluations.
02. Demonstrates improved stability in online A/B tests.
03. Effectively balances long-term engagement and allocation greediness.

Abstract

In a mixed short video and live stream recommendation scenario, the live stream recommendation system (RS) decides whether to allocate at most one live stream into the video feed for each user request. To maximize long-term user engagement, it is crucial to determine an optimal policy for live stream allocation. An inappropriate allocation policy, one that ignores the long-term negative impact of live stream allocation, can significantly affect app usage duration and user retention. Recently, reinforcement learning (RL) has been widely applied in recommendation systems to capture long-term user engagement. However, traditional RL algorithms often face divergence and instability problems, which restrict their application and deployment in large-scale industrial recommendation systems, especially in the aforementioned challenging…
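The long-term framing in the abstract can be illustrated with a discounted return over a user session. This is a minimal sketch only, not the paper's reward function or algorithm: the per-request rewards, the discount factor `gamma`, and the two example sessions are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Return sum over t of gamma**t * rewards[t], computed backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-request engagement rewards (assumed, not from the paper).
# Each entry corresponds to one user request in a session.
greedy_session = [1.0, 0.0, 0.0]        # early gain, then engagement drops
patient_session = [0.0, 1.0, 1.0, 1.0]  # no early gain, but a longer session

greedy_value = discounted_return(greedy_session)
patient_value = discounted_return(patient_session)
print(greedy_value < patient_value)
```

Under these assumed numbers, the session with the larger immediate reward has the lower discounted return, which is the kind of trade-off a long-term objective is meant to capture.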

Code & Models

Repositories

frankg1/SL-MGAC-torch (PyTorch, official)

Taxonomy

Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research