Ad-load Balancing via Off-policy Learning in a Content Marketplace
Hitesh Sagtani, Madan Jhawar, Rishabh Mehrotra, Olivier Jeunen

TL;DR
This paper introduces an off-policy learning framework for ad-load balancing on social media platforms. It trades off user satisfaction against ads revenue by learning from logged bandit feedback with unbiased estimators, and the approach is demonstrated at scale.
Contribution
It presents a novel off-policy learning approach to ad-load balancing that uses Inverse Propensity Scoring (IPS) and Doubly Robust (DR) estimators and accounts for user heterogeneity and session context.
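To make the estimators named above concrete, here is a minimal sketch of the textbook IPS and DR value estimators for logged bandit feedback. It is not the paper's implementation: the function names, the two-action setup, and the reward model are all illustrative assumptions, and this summary does not specify the paper's actual reward definition or policy class.

```python
from typing import Callable, Hashable, Sequence


def ips_estimate(
    rewards: Sequence[float],
    logging_probs: Sequence[float],
    target_probs: Sequence[float],
) -> float:
    """Textbook IPS: average of r_i * pi(a_i | x_i) / pi_0(a_i | x_i).

    logging_probs[i] is the logging policy's probability of the logged action;
    target_probs[i] is the target policy's probability of that same action.
    """
    weighted = (r * p / q for r, p, q in zip(rewards, target_probs, logging_probs))
    return sum(weighted) / len(rewards)


def dr_estimate(
    contexts: Sequence[Hashable],
    actions: Sequence[Hashable],
    rewards: Sequence[float],
    logging_probs: Sequence[float],
    target_policy: Callable[[Hashable, Hashable], float],
    reward_model: Callable[[Hashable, Hashable], float],
    action_space: Sequence[Hashable],
) -> float:
    """Textbook DR: model-based value plus an IPS-weighted residual correction."""
    total = 0.0
    for x, a, r, q in zip(contexts, actions, rewards, logging_probs):
        # Direct-method term: expected modelled reward under the target policy.
        direct = sum(target_policy(x, b) * reward_model(x, b) for b in action_space)
        # Correction term: importance-weighted error of the model on the logged action.
        weight = target_policy(x, a) / q
        total += direct + weight * (r - reward_model(x, a))
    return total / len(rewards)


# Hypothetical toy log: action 0 = "low ad load", action 1 = "high ad load",
# collected by a uniform-random logging policy (probability 0.5 per action).
contexts = ["s1", "s2", "s3", "s4"]
actions = [0, 1, 1, 0]
rewards = [1.0, 2.0, 4.0, 0.0]
logging_probs = [0.5, 0.5, 0.5, 0.5]


def always_high(x, a):
    """Hypothetical target policy that always picks action 1."""
    return 1.0 if a == 1 else 0.0


def zero_model(x, a):
    """Trivial reward model; with it, DR reduces to IPS."""
    return 0.0


ips_value = ips_estimate(rewards, logging_probs, [always_high(x, a) for x, a in zip(contexts, actions)])
dr_value = dr_estimate(contexts, actions, rewards, logging_probs, always_high, zero_model, [0, 1])
```

With the zero reward model the DR estimate coincides with the IPS estimate; a better reward model leaves the estimate unbiased (given correct logging propensities) while typically reducing variance, which is the usual motivation for DR.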
Findings
Significant improvements in user satisfaction metrics.
Increased ads revenue observed in large-scale deployment.
Effective offline policy evaluation using logged data.
Abstract
Ad-load balancing is a critical challenge in online advertising systems, particularly in the context of social media platforms, where the goal is to maximize user engagement and revenue while maintaining a satisfactory user experience. This requires the optimization of conflicting objectives, such as user satisfaction and ads revenue. Traditional approaches to ad-load balancing rely on static allocation policies, which fail to adapt to changing user preferences and contextual factors. In this paper, we present an approach that leverages off-policy learning and evaluation from logged bandit feedback. We start by presenting a motivating analysis of the ad-load balancing problem, highlighting the conflicting objectives between user satisfaction and ads revenue. We emphasize the nuances that arise due to user heterogeneity and the dependence on the user's position within a session. Based on…
