Breaching Euclidean Distance-Preserving Data Perturbation Using Few Known Inputs

Chris Giannella; Kun Liu; Hillol Kargupta

arXiv:0911.2942 · cs.DB · January 3, 2013

Open Access

TL;DR

This paper investigates the vulnerability of Euclidean distance-preserving data perturbation to attacks using few known original data points, demonstrating that such attacks can significantly compromise privacy even with limited prior knowledge.

Contribution

The study introduces a rigorous attack method leveraging small sets of known original data to estimate original data from perturbed data, analyzing its effectiveness and privacy implications.

Findings

01. An attacker with 4 known tuples can estimate the original data with less than 7% error.

02. The probability of a successful privacy breach exceeds 80% even with minimal known data.

03. Euclidean distance-preserving perturbation is vulnerable to known-input attacks that use only a small set of known tuples.
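
For intuition about why known inputs are dangerous, here is a minimal sketch of the easy baseline case, not the attack analyzed in the paper. It assumes the perturbation is a single orthogonal matrix, that the attacker knows as many linearly independent original tuples as there are dimensions, and that the attacker knows which released tuples they correspond to. The paper focuses on the harder case of fewer known tuples than dimensions. All names (`secret`, `private`, `released`, `known_idx`) and the synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # number of data dimensions (assumed for illustration)

# Hidden orthogonal perturbation matrix (the Q factor of a QR decomposition).
secret, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Synthetic private data: each column is one original tuple.
private = rng.standard_normal((n, 10))
released = secret @ private  # what the data owner publishes

# Assumption: the attacker knows n original tuples and their released images.
known_idx = [0, 1, 2]
known_original = private[:, known_idx]
known_released = released[:, known_idx]

# Solve M @ known_original = known_released for M.
# Transposing gives known_original.T @ M.T = known_released.T.
estimate = np.linalg.solve(known_original.T, known_released.T).T

# An orthogonal matrix is inverted by its transpose.
recovered = estimate.T @ released
rel_error = np.linalg.norm(recovered - private) / np.linalg.norm(private)
print(f"relative reconstruction error: {rel_error:.2e}")
```

With fewer than `n` known tuples the linear system above is underdetermined, which is why that case needs a different analysis.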

Abstract

We examine Euclidean distance-preserving data perturbation as a tool for privacy-preserving data mining. Such perturbations allow many important data mining algorithms (e.g., hierarchical and k-means clustering), with only minor modification, to be applied to the perturbed data and produce exactly the same results as if applied to the original data. However, the issue of how well the privacy of the original data is preserved needs careful study. We engage in this study by assuming the role of an attacker armed with a small set of known original data tuples (inputs). Little work has been done examining this kind of attack when the number of known original tuples is less than the number of data dimensions. We focus on this important case, and develop and rigorously analyze an attack that utilizes any number of known original tuples. The approach allows the attacker to estimate the original data…
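
To make the distance-preservation property concrete, the following sketch applies a random orthogonal matrix (one example of a Euclidean distance-preserving perturbation) to a small synthetic data set and checks that all pairwise distances are unchanged. The helper names `random_orthogonal` and `pairwise_distances`, the data, and the dimensions are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs so the diagonal of R is positive; q stays orthogonal.
    return q * np.sign(np.diag(r))

def pairwise_distances(data):
    """Euclidean distances between all pairs of rows of `data`."""
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
original = rng.standard_normal((6, 3))  # 6 tuples, 3 attributes (synthetic)
rotation = random_orthogonal(3, rng)
perturbed = original @ rotation.T       # apply the perturbation to every tuple

# Largest change in any pairwise distance; zero up to floating-point rounding.
max_gap = np.abs(pairwise_distances(original) - pairwise_distances(perturbed)).max()
print(f"largest change in any pairwise distance: {max_gap:.2e}")
```

Because algorithms such as hierarchical clustering depend on the data only through pairwise distances, they see the same input before and after this kind of perturbation.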

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications