An Automated Social Graph De-anonymization Technique

Kumar Sharad; George Danezis

arXiv:1408.1276·cs.CR·August 8, 2014

TL;DR

This paper introduces an automated, machine-learning-based method for re-identifying nodes in anonymized social networks, so that anonymization schemes can be evaluated quickly and their weaknesses exposed using real-world datasets.

Contribution

It presents a novel, automated approach employing decision forests to re-identify nodes across anonymized social graphs, even with limited training data.

Findings

01. High true positive rates achieved in re-identification
02. Effective even with small training samples
03. Can transfer learning across different social networks

Abstract

We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to match pairs of nodes in disparate anonymized sub-graphs. The technique uncovers artefacts and invariants of any black-box anonymization scheme from a small set of examples. Despite a high degree of automation, classification succeeds with significant true positive rates even when small false positive rates are sought. Our evaluation uses publicly available real-world datasets to study the performance of our approach against real-world anonymization strategies, namely the schemes used to protect datasets of The Data for Development (D4D) Challenge. We show that the technique is effective even when only small numbers of samples are used for training. Further, since it…
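The evaluation criterion named in the abstract, the true positive rate obtained when only a small false positive rate is tolerated, can be made concrete with a short sketch. The function name `tpr_at_fpr`, the score and label lists, and the thresholds below are all hypothetical illustrations, not taken from the paper; the scores stand in for whatever confidence a pair classifier (such as a decision forest) assigns to "these two nodes are the same individual".

```python
from typing import Sequence


def tpr_at_fpr(scores: Sequence[float], labels: Sequence[int], max_fpr: float) -> float:
    """Best true positive rate reachable with false positive rate <= max_fpr.

    scores: classifier confidence that a node pair is the same individual.
    labels: 1 for a true match, 0 for a non-match (hypothetical ground truth).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one match and one non-match")

    # Sweep the decision threshold from the highest score downwards.
    # Pairs with tied scores are accepted together, so every value
    # considered corresponds to a real operating point.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        if fp / negatives > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_tpr = tp / positives
        i = j
    return best_tpr


if __name__ == "__main__":
    # Made-up scores for eight candidate node pairs, three of them true matches.
    demo_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
    demo_labels = [1, 1, 0, 1, 0, 0, 0, 0]
    for budget in (0.0, 0.25):
        print(budget, tpr_at_fpr(demo_scores, demo_labels, budget))
```

Reporting the true positive rate at a fixed, small false positive budget matters in this setting because non-matching pairs vastly outnumber matching ones, so even a modest false positive rate can swamp the correct re-identifications.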

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Advanced Graph Neural Networks