Sustainable Online Reinforcement Learning for Auto-bidding

Zhiyu Mou; Yusen Huo; Rongquan Bai; Mingzhou Xie; Chuan Yu; Jian Xu; Bo Zheng

arXiv:2210.07006·cs.LG·October 14, 2022·5 cites

Open Access · 1 Repo · 1 Video

TL;DR

This paper introduces a sustainable online reinforcement learning framework for auto-bidding in advertising. The framework trains by interacting directly with the real advertising system rather than a virtual one, which addresses the gap between virtual and real environments while keeping exploration safe.

Contribution

The paper proposes a novel online RL framework that combines a safe exploration policy with variance-suppressed conservative Q-learning to overcome the offline-online discrepancy in auto-bidding.
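As a rough illustration of what "conservative Q-learning" refers to, the sketch below computes a standard CQL-style loss for a discrete action set: a squared TD error plus a penalty that pushes down Q-values of actions not seen in the data. This is a generic textbook form, not the paper's variance-suppressed variant, which this page does not describe; the function name, the `alpha` weight, and the discrete-action assumption are all hypothetical choices made for the example.

```python
import numpy as np

def conservative_q_loss(q_values, actions, td_targets, alpha=1.0):
    """Generic CQL-style loss (illustrative; not the paper's exact objective).

    q_values:   (batch, n_actions) Q estimates for every action
    actions:    (batch,) indices of the actions actually taken in the data
    td_targets: (batch,) bootstrapped targets for the taken actions
    alpha:      hypothetical weight on the conservative penalty
    """
    q_values = np.asarray(q_values, dtype=float)
    rows = np.arange(q_values.shape[0])
    q_taken = q_values[rows, actions]

    # Ordinary squared TD error on the actions present in the data.
    td_loss = np.mean((q_taken - td_targets) ** 2)

    # Numerically stable log-sum-exp over actions.
    row_max = q_values.max(axis=1)
    lse = row_max + np.log(np.exp(q_values - row_max[:, None]).sum(axis=1))

    # Conservative penalty: large when unseen actions have high Q-values
    # relative to the action that was actually taken.
    penalty = np.mean(lse - q_taken)
    return td_loss + alpha * penalty
```

The penalty term is never negative, so larger `alpha` values make the learned Q-function more pessimistic about actions outside the logged data.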

Findings

01 Outperforms state-of-the-art auto-bidding algorithms in simulations

02 Demonstrates effectiveness in real-world advertising systems

03 Provides theoretical safety guarantees for the exploration policy
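One common way to make exploration safe is to restrict it to a small neighborhood of an action that is already trusted. The sketch below shows that general idea only; this page does not say how the paper constructs its exploration policy or proves its guarantee, so the function name, the uniform noise, and the `radius` parameter are hypothetical.

```python
import numpy as np

def bounded_exploration_action(baseline_action, radius, rng, low=0.0, high=np.inf):
    """Perturb a trusted baseline action by at most `radius` (illustrative only).

    baseline_action: action proposed by an existing, trusted policy
    radius:          hypothetical maximum deviation allowed for exploration
    rng:             a numpy.random.Generator
    low, high:       hard limits on the final action (e.g. non-negative bids)
    """
    noise = rng.uniform(-radius, radius)
    return float(np.clip(baseline_action + noise, low, high))

# Usage: explore within +/- 0.1 of a baseline action of 1.0.
rng = np.random.default_rng(0)
samples = [bounded_exploration_action(1.0, 0.1, rng) for _ in range(5)]
```

Keeping every explored action within a fixed distance of the baseline bounds how far behavior can drift from the trusted policy in any single step.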

Abstract

Recently, the auto-bidding technique has become an essential tool to increase the revenue of advertisers. Facing the complex and ever-changing bidding environments in the real-world advertising system (RAS), state-of-the-art auto-bidding policies usually leverage reinforcement learning (RL) algorithms to generate real-time bids on behalf of the advertisers. Due to safety concerns, it was believed that the RL training process can only be carried out in an offline virtual advertising system (VAS) that is built based on the historical data generated in the RAS. In this paper, we argue that there exist significant gaps between the VAS and RAS, making the RL training process suffer from the problem of inconsistency between online and offline (IBOO). Firstly, we formally define the IBOO and systematically analyze its causes and influences. Then, to avoid the IBOO, we propose a sustainable online…

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

nobodymx/sorl-for-auto-bidding
PyTorch · Official

Videos

Sustainable Online Reinforcement Learning for Auto-bidding · SlidesLive

Taxonomy

Topics: Auction Theory and Applications · FinTech, Crowdfunding, Digital Finance · Consumer Market Behavior and Pricing

Methods: Q-Learning