Online Learning in Contextual Second-Price Pay-Per-Click Auctions
Mengxiao Zhang, Haipeng Luo

TL;DR
This paper investigates online learning strategies for contextual pay-per-click auctions, proposing algorithms with provable regret bounds and demonstrating their effectiveness through experiments.
Contribution
It introduces two practical algorithms for contextual auction learning with theoretical regret guarantees, improving upon previous non-contextual bounds.
Findings
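Achieved $\sqrt{T}$ regret via a computationally inefficient algorithm, and showed that this rate is unavoidable.
Developed two practical algorithms with provable regret guarantees.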
Validated effectiveness through synthetic data experiments.
Abstract
We study online learning in contextual pay-per-click auctions where, at each of the $T$ rounds, the learner receives some context along with a set of ads and needs to estimate their click-through rates (CTRs) in order to run a second-price pay-per-click auction. The learner's goal is to minimize her regret, defined as the gap between her total revenue and that of an oracle strategy that always makes perfect CTR predictions. We first show that $\sqrt{T}$-regret is obtainable via a computationally inefficient algorithm and that it is unavoidable since our problem is no easier than the classical multi-armed bandit problem. A by-product of our results is a $\sqrt{T}$-regret bound for the simpler non-contextual setting, improving upon a recent work of [Feng et al., 2023] by removing the inverse CTR dependency that could be arbitrarily large. Then, borrowing ideas from recent…
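The revenue and regret notions in the abstract can be illustrated with a minimal sketch of a single round. The pricing rule here is an assumption (the common pay-per-click second-price rule, where ads are ranked by bid times estimated CTR and the winner pays per click the smallest price that keeps it ranked first); the abstract does not spell out the paper's exact formulation. The function names `second_price_ppc_revenue` and `round_regret` are hypothetical.

```python
from typing import Sequence


def second_price_ppc_revenue(
    bids: Sequence[float],
    est_ctr: Sequence[float],
    true_ctr: Sequence[float],
) -> float:
    """Expected revenue of one second-price pay-per-click round.

    Assumed rule (not taken from the paper): rank ads by
    bid * estimated CTR; the winner pays, per click, the runner-up's
    score divided by the winner's estimated CTR.
    """
    if len(bids) < 2:
        raise ValueError("need at least two ads")
    scores = [b * c for b, c in zip(bids, est_ctr)]
    order = sorted(range(len(bids)), key=lambda i: scores[i], reverse=True)
    winner, runner_up = order[0], order[1]
    if est_ctr[winner] <= 0:
        return 0.0  # degenerate case: no meaningful per-click price
    price_per_click = scores[runner_up] / est_ctr[winner]
    # Payment only happens on a click, so weight by the true CTR.
    return true_ctr[winner] * price_per_click


def round_regret(
    bids: Sequence[float],
    est_ctr: Sequence[float],
    true_ctr: Sequence[float],
) -> float:
    """Oracle revenue (perfect CTR predictions) minus the learner's revenue."""
    oracle = second_price_ppc_revenue(bids, true_ctr, true_ctr)
    learner = second_price_ppc_revenue(bids, est_ctr, true_ctr)
    return oracle - learner


if __name__ == "__main__":
    bids = [1.0, 2.0, 1.5]
    true_ctr = [0.5, 0.2, 0.4]
    est_ctr = [0.5, 0.5, 0.4]  # the second ad's CTR is overestimated
    print(round_regret(bids, est_ctr, true_ctr))
```

Summing `round_regret` over all $T$ rounds gives the total regret in the sense of the abstract. Under this assumed rule a single round's value can be negative, since a misestimate may raise the price charged to the winner.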
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Auction Theory and Applications
