Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks

Ze Wang; Guogang Liao; Xiaowen Shi; Xiaoxu Wu; Chuheng Zhang; Yongkang Wang; Xingxing Wang; Dong Wang

arXiv:2204.00888·cs.LG·August 12, 2022

TL;DR

This paper introduces a reinforcement learning approach for ads allocation that learns improved list-wise representations using auxiliary tasks, resulting in higher revenue and better generalization in recommendation systems.

Contribution

It proposes a novel RL method with auxiliary tasks for list-wise ads allocation, enhancing representation learning and sample efficiency.

Findings

01. Improved list-wise representations lead to higher platform revenue.
02. Auxiliary tasks significantly enhance RL agent performance.
03. Method outperforms state-of-the-art baselines in experiments.

Abstract

With the recent prevalence of reinforcement learning (RL), there has been tremendous interest in utilizing RL for ads allocation in recommendation platforms (e.g., e-commerce and news feed sites). To achieve better allocation, the input of recent RL-based ads allocation methods has been upgraded from a point-wise single item to a list-wise item arrangement. However, this also results in a high-dimensional space of state-action pairs, making it difficult to learn list-wise representations with good generalization ability. This further hinders the exploration of RL agents and causes poor sample efficiency. To address this problem, we propose a novel RL-based approach for ads allocation which learns better list-wise representations by leveraging task-specific signals on the Meituan food delivery platform. Specifically, we propose three different auxiliary tasks based on reconstruction, prediction, and…
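To give a rough feel for why moving from point-wise to list-wise decisions enlarges the action space, the sketch below counts slot layouts under a simplified setting that is an assumption for illustration and is not taken from the paper: a page has `num_slots` positions, at most `max_ads` of them may hold an ad, and only the choice of which positions are ad slots is counted. The function name and the numbers are hypothetical.

```python
from math import comb


def count_slot_layouts(num_slots: int, max_ads: int) -> int:
    """Count ways to mark at most `max_ads` of `num_slots` positions as ad slots.

    Illustrative assumption: only the ad/organic pattern is counted,
    not which specific ad or item fills each position.
    """
    return sum(comb(num_slots, k) for k in range(max_ads + 1))


# A point-wise decision at a single position: insert an ad or not.
point_wise_choices = 2

# A list-wise decision over a whole page of 10 positions with at most 5 ads.
list_wise_choices = count_slot_layouts(num_slots=10, max_ads=5)

print(point_wise_choices, list_wise_choices)  # prints: 2 638
```

Even this simplified count grows quickly with page length, and it ignores the identity of the ads and organic items, so the real state-action space is larger still. This is consistent with the abstract's point that list-wise representations are harder to learn and generalize.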

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Recommender Systems and Techniques · Advanced Bandit Algorithms Research

Methods: Contrastive Learning