Towards more accurate and useful data anonymity vulnerability measures

Paul Francis; David Wagner

arXiv:2403.06595 · cs.CR · March 12, 2024 · 2 cites

TL;DR

This paper critically evaluates existing data anonymization vulnerability measures, identifies common methodological flaws, and proposes improved frameworks for more accurate risk assessment in privacy-preserving data publishing.

Contribution

The paper introduces the non-member framework for computing a more accurate inference baseline, and highlights the need for realistic membership base rates and clearer reporting in membership inference studies.
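As a rough illustration of why an inference baseline matters, the sketch below compares how often a simple inference rule is right about people in the data with how often it is right about people who are not in the data. This is only one plausible reading of a non-member baseline, not the authors' procedure, which this summary does not spell out. The toy records and the names `fit_rule` and `rule_precision` are hypothetical, and the raw member rows stand in for a released dataset to keep the example short.

```python
from collections import Counter, defaultdict

def fit_rule(rows):
    """Learn the most common secret value for each known-attribute value."""
    counts = defaultdict(Counter)
    for known, secret in rows:
        counts[known][secret] += 1
    return {known: c.most_common(1)[0][0] for known, c in counts.items()}

def rule_precision(rule, targets):
    """Fraction of targets whose secret value the rule predicts correctly."""
    hits = sum(1 for known, secret in targets if rule.get(known) == secret)
    return hits / len(targets)

# Hypothetical toy records: (known attribute, secret attribute).
members = [("A", 1)] * 8 + [("A", 0)] * 2 + [("B", 0)] * 7 + [("B", 1)] * 3
non_members = [("A", 1)] * 4 + [("A", 0)] * 1 + [("B", 0)] * 3 + [("B", 1)] * 2

# Assumption: the rule stands in for an attack run against released data.
rule = fit_rule(members)

attack_precision = rule_precision(rule, members)        # people in the data
baseline_precision = rule_precision(rule, non_members)  # people not in the data

print(f"attack precision:   {attack_precision:.2f}")
print(f"baseline precision: {baseline_precision:.2f}")
print(f"improvement:        {attack_precision - baseline_precision:.2f}")
```

In this toy setup the rule is almost as accurate for people outside the data as for people inside it, so most of its apparent success reflects general population statistics rather than a leak about specific individuals.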

Findings

01 Several prominent attack papers overstate privacy risk because their statistical inference baseline is flawed or missing.

02 The non-member framework yields a more accurate inference baseline.

03 Some papers do not use a realistic membership base rate or clear reporting standards.

Abstract

The purpose of anonymizing structured data is to protect the privacy of individuals in the data while retaining the statistical properties of the data. There is a large body of work that examines anonymization vulnerabilities. Focusing on strong anonymization mechanisms, this paper examines a number of prominent attack papers and finds several problems, all of which lead to overstating risk. First, some papers fail to establish a correct statistical inference baseline (or any at all), leading to incorrect measures. Notably, the reconstruction attack from the US Census Bureau that led to a redesign of its disclosure method made this mistake. We propose the non-member framework, an improved method for how to compute a more accurate inference baseline, and give examples of its operation. Second, some papers don't use a realistic membership base rate, leading to incorrect precision…
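The dependence of precision on the membership base rate follows from Bayes' rule. The sketch below is a minimal illustration of that relationship, not a computation taken from the paper. The function name `claim_precision` and the attack rates (60% true positive rate, 10% false positive rate) are hypothetical values chosen for the example.

```python
def claim_precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision of membership claims, given the attack's true positive rate,
    false positive rate, and the fraction of targets who are really members."""
    true_pos = tpr * base_rate            # members correctly flagged
    false_pos = fpr * (1.0 - base_rate)   # non-members wrongly flagged
    total = true_pos + false_pos
    return true_pos / total if total > 0 else 0.0

# Hypothetical attack: 60% true positive rate, 10% false positive rate.
for rate in (0.5, 0.1, 0.01):
    print(f"base rate {rate}: precision {claim_precision(0.6, 0.1, rate):.3f}")
```

With these assumed rates, precision is about 0.86 when half of the targets are members, 0.40 when one in ten is, and about 0.06 when one in a hundred is. An evaluation that assumes a 50% base rate can therefore report much higher precision than the same attack would achieve against a realistic population.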

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Security and Verification in Computing · Network Security and Intrusion Detection