Differentially Private Weighted Sampling
Edith Cohen, Ofir Geri, Tamas Sarlos, Uri Stemmer

TL;DR
This paper introduces private weighted sampling (PWS), a method that ensures element-level differential privacy for datasets with key-frequency pairs, improving utility and reporting accuracy over prior methods, especially for skewed distributions.
Contribution
PWS is a private sampling technique for weighted data that provides element-level differential privacy, outperforms existing private methods, and can be layered on top of non-private sampling schemes.
Findings
20%-300% increase in key reporting for Zipfian distributions
2-8 times lower estimation error for low-frequency keys
Effective as a post-processing step without access to original data
Abstract
Common datasets have the form of elements with keys (e.g., transactions and products) and the goal is to perform analytics on the aggregated form of key and frequency pairs. A weighted sample of keys by (a function of) frequency is a highly versatile summary that provides a sparse set of representative keys and supports approximate evaluations of query statistics. We propose private weighted sampling (PWS): A method that ensures element-level differential privacy while retaining, to the extent possible, the utility of a respective non-private weighted sample. PWS maximizes the reporting probabilities of keys and estimation quality of a broad family of statistics. PWS improves over the state of the art also for the well-studied special case of private histograms, when no sampling is performed. We empirically demonstrate significant performance gains compared with prior baselines:…
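To make the setting concrete, here is a minimal sketch of the non-private starting point the abstract describes: elements are aggregated into key and frequency pairs, and keys are sampled with probability proportional to frequency. It is not the PWS algorithm and adds no privacy protection. The function names, the threshold parameter `tau`, and the choice of Poisson probability-proportional-to-size sampling with inverse-probability estimates are assumptions made for illustration.

```python
import random
from collections import Counter


def aggregate(elements):
    """Collapse a list of element keys into key -> frequency pairs."""
    return Counter(elements)


def pps_sample(freqs, tau, rng):
    """Non-private Poisson PPS sample (illustrative, not PWS).

    Each key is kept independently with probability min(1, f / tau).
    The stored value is the inverse-probability estimate of its frequency.
    """
    sample = {}
    for key, f in freqs.items():
        p = min(1.0, f / tau)
        if rng.random() < p:
            sample[key] = f / p
    return sample


def estimate_sum(sample, predicate):
    """Estimate the total frequency of keys that satisfy `predicate`."""
    return sum(est for key, est in sample.items() if predicate(key))


if __name__ == "__main__":
    # Hypothetical data: one frequent key and a few rare ones.
    elements = ["a"] * 50 + ["b"] * 5 + ["c"] * 2 + ["d"]
    freqs = aggregate(elements)
    sample = pps_sample(freqs, tau=10, rng=random.Random(0))
    print(sample)
    print(estimate_sum(sample, lambda k: k != "a"))
```

Keys with frequency at least `tau` are always kept with their exact frequency, while rarer keys are kept with probability `f / tau`. A private scheme also has to limit how often low-frequency keys are reported, which is the regime where the listed findings claim improvements.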
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Internet Traffic Analysis and Secure E-voting
