CriteoPrivateAds: A Real-World Bidding Dataset to Design Private Advertising Systems
Mehdi Sebbar, Corentin Odic, Mathieu Léchine, Aloïs Bissuel, Nicolas Chrysanthos, Anthony D'Amato, Alexandre Gilotte, Fabian Höring, Sarah Nogueira, Maxime Vono

TL;DR
This paper introduces CriteoPrivateAds, a comprehensive anonymized dataset derived from real-world ad bidding logs, designed to facilitate the development and evaluation of privacy-preserving online advertising models.
Contribution
It provides the largest public bidding dataset in terms of number of features, built in alignment with major browser vendors' privacy proposals such as Chrome Privacy Sandbox, enabling realistic offline testing of private advertising systems.
Findings
Dataset closely mimics production ad performance
Supports learning under several privacy constraints, including delayed reports, display and user-level differential privacy, and user signal quantisation
Facilitates development of privacy-aware bidding models
Abstract
In recent years, many proposals have emerged to address online advertising use-cases without access to third-party cookies. All these proposals leverage privacy-enhancing technologies such as aggregation or differential privacy. Yet no public, sufficiently rich ground truth is currently available to assess the relevance of these private advertising frameworks. We are releasing the largest bidding dataset, in terms of number of features, specifically built in alignment with the design of major browser vendors' proposals such as Chrome Privacy Sandbox. This dataset, coined CriteoPrivateAds, is an anonymised version of Criteo production logs and provides sufficient data to learn bidding models commonly used in online advertising under many privacy constraints (delayed reports, display and user-level differential privacy, user signal quantisation or aggregated…
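To make two of the constraints named in the abstract concrete, here is a minimal Python sketch of user signal quantisation followed by a display-level differentially private aggregate. It is an illustration under stated assumptions, not the paper's pipeline or the dataset's schema: the function names, the `(signal, clicked)` row layout, the assumption that signals lie in [0, 1], and the choice of the Laplace mechanism are all hypothetical.

```python
import random


def quantise(signal, n_bins):
    """Map a signal in [0, 1] to one of n_bins coarse buckets (hypothetical encoding)."""
    return min(int(signal * n_bins), n_bins - 1)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_click_counts(rows, n_bins, epsilon, rng):
    """Release per-bucket click counts with display-level epsilon-DP.

    rows: iterable of (signal, clicked), one row per display, clicked in {0, 1}.
    The set of buckets (0..n_bins-1) is fixed in advance, so it reveals nothing.
    """
    counts = [0] * n_bins
    for signal, clicked in rows:
        counts[quantise(signal, n_bins)] += clicked
    # Adding or removing one display changes a single count by at most 1,
    # so the L1 sensitivity is 1 and the Laplace scale is 1 / epsilon.
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy rows, not taken from the dataset.
    rows = [(0.10, 1), (0.15, 0), (0.80, 1), (0.12, 1)]
    print(noisy_click_counts(rows, n_bins=4, epsilon=1.0, rng=rng))
```

The noise scale here covers display-level privacy only. A user-level guarantee would additionally require bounding how many displays each user contributes, which this sketch does not attempt.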
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Internet Traffic Analysis and Secure E-voting
