Lessons from the AdKDD'21 Privacy-Preserving ML Challenge
Eustache Diemert, Romain Fabre, Alexandre Gilotte, Fei Jia, Basile Leparmentier, Jérémie Mary, Zhonghua Qu, Ugo Tanielian, Hui Yang

TL;DR
This paper analyzes the AdKDD'21 Privacy-Preserving ML Challenge. It shows that models trained on large aggregated data, supplemented by a small set of unaggregated data points, can be effective, and it discusses the implications for privacy-preserving sharing of advertising data.
Contribution
The paper describes the challenge tasks, the structure of the datasets, and the results, and makes the challenge fully reproducible, providing insights into privacy-preserving machine learning in online advertising.
Findings
Learning models on large aggregated data, in the presence of a small set of unaggregated data points, can be surprisingly efficient and cheap.
The paper evaluates how sensitive the methods are to privacy parameters and side information.
The industry needs alternative data-sharing designs, or breakthroughs, to use private advertising data effectively.
Abstract
Designing data sharing mechanisms providing performance and strong privacy guarantees is a hot topic for the Online Advertising industry. Namely, a prominent proposal discussed under the Improving Web Advertising Business Group at W3C only allows sharing advertising signals through aggregated, differentially private reports of past displays. To study this proposal extensively, an open Privacy-Preserving Machine Learning Challenge took place at AdKDD'21, a premier workshop on Advertising Science with data provided by advertising company Criteo. In this paper, we describe the challenge tasks, the structure of the available datasets, report the challenge results, and enable its full reproducibility. A key finding is that learning models on large, aggregated data in the presence of a small set of unaggregated data points can be surprisingly efficient and cheap. We also run additional…
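To make the idea of "aggregated, differentially private reports of past displays" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the challenge's actual mechanism: the Laplace noise, the `epsilon` parameter, the record layout, and the function name `aggregate_report` are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def aggregate_report(displays, feature, epsilon, rng):
    """Return noisy (display_count, click_count) per value of `feature`.

    Assumptions (not taken from the paper): each display is a dict with a
    categorical `feature` and a binary "click" label, and noise is Laplace.
    """
    counts = defaultdict(lambda: [0, 0])
    for row in displays:
        key = row[feature]
        counts[key][0] += 1             # one more display for this value
        counts[key][1] += row["click"]  # one more click if the label is 1

    # One display changes the display count by 1 and the click count by at
    # most 1, so the L1 sensitivity of the report is 2.
    scale = 2.0 / epsilon

    def laplace():
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

    # Simplification: only observed feature values are reported. A rigorous
    # mechanism would release a fixed, data-independent set of keys.
    return {k: (d + laplace(), c + laplace()) for k, (d, c) in counts.items()}


# Hypothetical toy log of past displays.
sample_displays = [
    {"campaign": "a", "click": 1},
    {"campaign": "a", "click": 0},
    {"campaign": "b", "click": 0},
]
report = aggregate_report(sample_displays, "campaign", epsilon=1.0,
                          rng=random.Random(0))
```

A learner in this setting would see only `report`, never `sample_displays`. A smaller `epsilon` means more noise in each count, which is the kind of privacy parameter whose effect the paper studies.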
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Mobile Crowdsensing and Crowdsourcing
