Support-aware offline policy selection for advertising marketplaces
Prashant Shekhar, Caroline Howard

TL;DR
This paper introduces a support-aware offline decision framework for reserve-price policy selection in advertising auctions, emphasizing certified validation over simple ranking.
Contribution
It develops a conservative, support-aware evaluation method that certifies policies and quantifies uncertainty, improving offline policy validation in ad marketplaces.
Findings
Achieved a 47.66% replay lift on real-time bidding logs.
Reduced the policy catalog from 19 candidates to 2 for validation.
Certified non-harm across multiple segments.
Abstract
Logged advertising auctions make offline reserve-price evaluation attractive but risky. Replay tables can identify policies with large apparent yield gains, yet they can also hide weak threshold support, multiple-comparison effects, subgroup harm, and bidder-response uncertainty. Existing replay and off-policy evaluation methods estimate or rank policy values, but they do not directly answer the operational question of whether the available evidence is strong enough to justify validation. This paper develops a support-aware offline decision framework for reserve-policy selection. Rather than outputting a single point-estimate winner, the framework converts logged evidence into a conservative decision object consisting of certified policies, statistically dominated alternatives, and unresolved candidates requiring further validation. The main theoretical result gives a unified…
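As a rough illustration of the decision object described in the abstract, the sketch below partitions candidate policies into certified, dominated, and unresolved sets from per-auction replay lifts. It is not the paper's method. The function names (`lift_interval`, `decide`), the `min_support` threshold, the normal approximation, and the Bonferroni correction are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean, stdev


@dataclass
class Interval:
    lower: float
    upper: float
    n: int


def lift_interval(lifts, n_policies, alpha=0.05):
    """Bonferroni-adjusted normal interval for the mean per-auction lift.

    Assumption: lifts are i.i.d. differences (policy revenue minus baseline
    revenue) from replay; the paper may use a different bound.
    """
    n = len(lifts)
    # Split alpha across all candidates to account for multiple comparisons.
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_policies))
    half_width = z * stdev(lifts) / sqrt(n)
    m = mean(lifts)
    return Interval(m - half_width, m + half_width, n)


def decide(lift_samples, min_support=30, alpha=0.05):
    """Partition policies into certified, dominated, and unresolved sets.

    lift_samples maps a (hypothetical) policy name to its replay lifts.
    """
    min_support = max(min_support, 2)  # stdev needs at least two samples
    k = len(lift_samples)
    intervals = {}
    decision = {"certified": [], "dominated": [], "unresolved": []}
    for name, lifts in lift_samples.items():
        if len(lifts) < min_support:
            # Too little support to say anything: defer to validation.
            decision["unresolved"].append(name)
        else:
            intervals[name] = lift_interval(lifts, k, alpha)

    # The baseline has lift 0 by definition, so it is always a comparator.
    best_lower = max([0.0] + [iv.lower for iv in intervals.values()])
    for name, iv in intervals.items():
        if iv.upper < best_lower:
            decision["dominated"].append(name)   # beaten with confidence
        elif iv.lower > 0:
            decision["certified"].append(name)   # confidently beats baseline
        else:
            decision["unresolved"].append(name)  # evidence is inconclusive
    for bucket in decision.values():
        bucket.sort()
    return decision
```

A policy with a large point estimate but few supporting auctions lands in the unresolved set rather than at the top of a ranking, which is the behavior the abstract contrasts with plain replay tables.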
