Cardinality Estimators do not Preserve Privacy

Damien Desfontaines, Andreas Lochbihler, and David Basin

arXiv:1808.05879·cs.CR·December 20, 2018

TL;DR

Cardinality estimators like HyperLogLog cannot provide strong privacy guarantees while also keeping their accuracy and aggregation properties, so their sketches should be treated as being as sensitive as raw data.

Contribution

We formalize a privacy notion for cardinality estimators, demonstrate their incompatibility with strong privacy guarantees, and analyze existing algorithms' privacy risks.

Findings

01. Existing estimators leak significant private information, even when the multisets are large.

02. Strong aggregation requirements conflict with privacy preservation.

03. Mitigation strategies are proposed for practical applications.

Abstract

Cardinality estimators like HyperLogLog are sketching algorithms that estimate the number of distinct elements in a large multiset. Their use in privacy-sensitive contexts raises the question of whether they leak private information. In particular, can they provide any privacy guarantees while preserving their strong aggregation properties? We formulate an abstract notion of cardinality estimators that captures this aggregation requirement: one can merge sketches without losing precision. We propose an attacker model and a corresponding privacy definition, strictly weaker than differential privacy: we assume that the attacker has no prior knowledge of the data. We then show that if a cardinality estimator satisfies this definition, then it cannot have a reasonable level of accuracy. We prove similar results for weaker versions of our definition, and analyze the privacy of existing…
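To make the aggregation requirement concrete, here is a minimal Python sketch of a HyperLogLog-style estimator. It is an illustration under stated assumptions, not the construction analyzed in the paper: the bucket count (64), the use of SHA-256 as the hash, the omission of small-range and large-range corrections, and the helper names `add`, `merge`, `build`, and `estimate` are all choices made here. It shows two properties the abstract relies on: merging two sketches gives exactly the sketch of the union, and adding an element that is already represented leaves the sketch unchanged.

```python
import hashlib

INDEX_BITS = 6                    # assumed parameter: 2**6 = 64 buckets
NUM_BUCKETS = 1 << INDEX_BITS
REST_BITS = 64 - INDEX_BITS       # bits left after choosing a bucket


def _hash64(item: str) -> int:
    """64-bit hash of an item (SHA-256 truncated; an assumption of this sketch)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def empty_sketch() -> list:
    return [0] * NUM_BUCKETS


def add(sketch: list, item: str) -> list:
    """Return a new sketch that also accounts for `item`."""
    h = _hash64(item)
    bucket = h >> REST_BITS                   # top bits pick the bucket
    rest = h & ((1 << REST_BITS) - 1)         # remaining bits
    # Rank = 1-based position of the leftmost 1-bit in `rest`.
    rank = REST_BITS - rest.bit_length() + 1
    out = list(sketch)
    out[bucket] = max(out[bucket], rank)
    return out


def merge(a: list, b: list) -> list:
    """Bucket-wise maximum: the sketch of the union of the two inputs."""
    return [max(x, y) for x, y in zip(a, b)]


def build(items) -> list:
    sketch = empty_sketch()
    for item in items:
        sketch = add(sketch, item)
    return sketch


def estimate(sketch: list) -> float:
    """Raw HyperLogLog estimate, without small- or large-range corrections."""
    alpha = 0.709                             # standard constant for 64 buckets
    m = NUM_BUCKETS
    return alpha * m * m / sum(2.0 ** -r for r in sketch)


if __name__ == "__main__":
    a = build(["alice", "bob", "carol"])
    b = build(["carol", "dave"])
    # Merging loses nothing relative to sketching the union directly.
    print(merge(a, b) == build(["alice", "bob", "carol", "dave"]))
    # Re-adding a present element never changes the sketch.
    print(add(a, "bob") == a)
```

The second check hints at why such sketches can be sensitive: someone holding a sketch can test whether adding a candidate element changes it. If the sketch changes, the element was not in the input. If it does not change, the element may have been present, although an absent element can also leave the sketch unchanged, so this is evidence rather than proof.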
