Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning

Tao Liu; Qi Xu; Wei Shi; Zhigang Hua; Shuang Yang

arXiv:2501.05591 · cs.LG · January 13, 2025

TL;DR

This paper presents an offline robust reinforcement learning framework for session-level ad load optimization that balances user experience against monetization, reporting over 80% offline gains over causal learning baselines and double-digit online improvements in the engagement-ad score trade-off.

Contribution

The paper introduces a novel offline robust dueling DQN approach that mitigates confounding bias and enhances stability against distribution shifts in ad load optimization.

Findings

01 · Over 80% offline gains over causal learning baselines

02 · Additional 5% offline gains with robustness enhancements

03 · Double-digit improvements in online engagement-ad score trade-off

Abstract

Session-level dynamic ad load optimization aims to personalize the density and types of delivered advertisements in real time during a user's online session by dynamically balancing user experience quality and ad monetization. Traditional causal learning-based approaches struggle with key technical challenges, especially in handling confounding bias and distribution shifts. In this paper, we develop an offline deep Q-network (DQN)-based framework that effectively mitigates confounding bias in dynamic systems and demonstrates more than 80% offline gains compared to the best causal learning-based production baseline. Moreover, to improve the framework's robustness against unanticipated distribution shifts, we further enhance our framework with a novel offline robust dueling DQN approach. This approach achieves more stable rewards on multiple OpenAI-Gym datasets as perturbations increase,…
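The abstract names a dueling DQN but this page does not describe the architecture. As a minimal sketch only, the standard dueling aggregation (state value plus mean-centered advantages) can be written as below. It is not the authors' robust variant: the function names, the three ad-load levels, and the numbers are hypothetical and chosen purely for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and per-action advantages A(s, a) into Q(s, a).

    Uses the standard dueling aggregation:
        Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    Subtracting the mean advantage makes V and A identifiable.
    """
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical session state with three ad-load levels: low, medium, high.
# In a real network, state_value and advantages would come from two output heads.
q = dueling_q_values(state_value=1.0, advantages=[0.5, -0.5, 0.0])
best = greedy_action(q)
```

In an offline setting such as the one the abstract describes, these Q-values would be fit to logged sessions rather than to fresh interaction; that training loop, and whatever robustness mechanism the paper adds, are outside this sketch.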

Taxonomy

Topics: Iterative Learning Control Systems · VLSI and FPGA Design Techniques · Scheduling and Optimization Algorithms

Methods: Q-Learning · Convolution · Dense Connections · Deep Q-Network