Scalable Audience Reach Estimation in Real-time Online Advertising
Ali Jalali, Santanu Kolay, Peter Foldes, Ali Dasdan

TL;DR
This paper presents a distributed, fault-tolerant system for estimating audience reach in real-time online advertising. It balances speed and accuracy by keeping a small representative sample of the data, drawn with stratified sampling and spread across multiple machines.
Contribution
It introduces a novel distributed sampling-based forecasting system that improves accuracy and speed over existing industry methods for online ad reach estimation.
Findings
Significant accuracy improvements over uniform sampling.
Faster estimation suitable for real-time applications.
Effective handling of minority groups through fuzzy fallback.
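To make the contrast with uniform sampling concrete, here is a minimal sketch of a stratified reach estimator. It is not the authors' implementation. The function names, the dict-based user records, and the `per_stratum` parameter are all assumptions made for illustration, and the distributed, fault-tolerant, and fuzzy-fallback parts of the system are not modeled.

```python
import random
from collections import defaultdict


def build_stratified_sample(users, stratum_key, per_stratum, seed=0):
    """Group users by one attribute and draw up to `per_stratum` from each group.

    Hypothetical helper: `users` is a list of dicts, `stratum_key` names the
    attribute used to form strata. Returns {stratum: (stratum_size, sampled_users)}.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for user in users:
        strata[user[stratum_key]].append(user)

    sample = {}
    for key, members in strata.items():
        # Small strata are kept whole, so minority groups are never sampled away.
        k = min(per_stratum, len(members))
        sample[key] = (len(members), rng.sample(members, k))
    return sample


def estimate_reach(sample, predicate):
    """Estimate how many users in the full population satisfy `predicate`.

    Each stratum contributes (stratum size) * (matching fraction in its sample).
    """
    total = 0.0
    for stratum_size, drawn in sample.values():
        if not drawn:
            continue
        matches = sum(1 for user in drawn if predicate(user))
        total += stratum_size * matches / len(drawn)
    return total


if __name__ == "__main__":
    # Toy population: one large group and one small group.
    population = [{"country": "US", "mobile": i % 2 == 0} for i in range(1000)]
    population += [{"country": "IS", "mobile": True} for _ in range(10)]

    sample = build_stratified_sample(population, "country", per_stratum=50)
    # Targeting criterion: mobile users in the small group.
    print(estimate_reach(sample, lambda u: u["country"] == "IS" and u["mobile"]))
```

A uniform sample of the same total size would contain only a handful of users from the small group, or none at all, so any estimate restricted to that group would be noisy. Sampling each stratum separately keeps every group represented.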
Abstract
Online advertising has emerged in recent years as one of the most efficient methods of advertising. Yet advertisers are concerned about the efficiency of their online campaigns and, consequently, would like to restrict their ad impressions to certain websites and/or certain audience groups. These restrictions, known as targeting criteria, limit reachability in exchange for better performance. This trade-off between reachability and performance creates a need for a forecasting system that can estimate it quickly and with good accuracy. Designing such a system is challenging due to (a) the huge amount of data to process and (b) the need for fast and accurate estimates. In this paper, we propose a distributed, fault-tolerant system that can generate such estimates quickly and with good accuracy. The main idea is to keep a small representative…
Taxonomy
Topics: Data Management and Algorithms · Algorithms and Data Compression · Web Data Mining and Analysis
