Statistical Privacy
Dennis Breutigam, Rüdiger Reischuk

TL;DR
This paper introduces a statistical privacy framework that models an attacker's prior knowledge as a data distribution, providing formulas to evaluate privacy guarantees and analyze the impact of noise and subsampling.
Contribution
It develops exact formulas for privacy parameters under statistical privacy, enabling practical estimation and comparison with differential privacy.
Findings
The exact formulas for the privacy parameters depend on the probability that an entry fulfills the queried property.
Adding noise and subsampling both affect the privacy-utility tradeoff.
Statistical privacy parameters can be estimated tightly and compared with those of differential privacy.
Abstract
To analyze the privacy guarantee of personal data in a database that is subject to queries, it is necessary to model the prior knowledge of a possible attacker. Differential privacy considers a worst-case scenario where the attacker knows almost everything, which in many applications is unrealistic and requires a large utility loss. This paper considers a situation called statistical privacy, where an adversary knows the distribution by which the database is generated but not the exact data of all (or sufficiently many) of its entries. We analyze in detail how the entropy of the distribution guarantees privacy for a large class of queries called property queries. Exact formulas are obtained for the privacy parameters. We analyze how they depend on the probability that an entry fulfills the property under investigation. These formulas turn out to be lengthy, but can be used for tight numerical…
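The dependence of the privacy parameters on the property probability can be illustrated numerically. The sketch below is not the paper's formula. It assumes a simplified setting chosen for illustration: a database of n entries that fulfill the property independently with probability p, a query that returns the exact count, and an adversary who tries to tell whether one target entry fulfills the property. The function names and the use of the standard hockey-stick divergence to obtain δ for a given ε are assumptions of this example.

```python
from math import comb, exp


def binom_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by k."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def count_distributions(n, p):
    """Distributions of the count query, conditioned on the target entry.

    Assumption: the other n - 1 entries are i.i.d. Bernoulli(p) and unknown
    to the adversary, so only their count distribution is available.
    """
    rest = binom_pmf(n - 1, p)
    without_target = rest + [0.0]  # target does not fulfill the property
    with_target = [0.0] + rest     # target fulfills it, count shifts by one
    return without_target, with_target


def hockey_stick(P, Q, eps):
    """Sum over outcomes of max(0, P(k) - e^eps * Q(k))."""
    return sum(max(0.0, a - exp(eps) * b) for a, b in zip(P, Q))


def delta_for_epsilon(n, p, eps):
    """Smallest delta making the two count distributions (eps, delta)-close."""
    without_target, with_target = count_distributions(n, p)
    return max(
        hockey_stick(without_target, with_target, eps),
        hockey_stick(with_target, without_target, eps),
    )
```

Under these assumptions, a database with a single entry gives δ = 1 for every ε, because the count reveals the target's value outright. As n grows, the uncertainty about the remaining entries masks the target and δ shrinks, which is the effect the abstract attributes to the entropy of the distribution.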
Taxonomy
Topics: Privacy-Preserving Technologies in Data
