Diversity, Topology, and the Risk of Node Re-identification in Labeled Social Graphs

Sameera Horawalavithana; Clayton Gandy; Juan Arroyo Flores; John Skvoretz; Adriana Iamnitchi

arXiv:1808.10837·cs.SI·April 28, 2020

TL;DR

This paper investigates how binary node attributes and network topology influence privacy risks in social graphs, demonstrating that attribute diversity reduces anonymity and highlighting the importance of considering both factors in privacy assessments.

Contribution

It provides a quantitative analysis of how node attribute diversity and graph topology affect re-identification risks in labeled social networks.

Findings

01 Binary attribute diversity degrades node anonymity

02 Topology and attribute placement interact to influence privacy risk

03 Machine learning attacks can effectively re-identify nodes based on attributes

Abstract

Real network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. When nodes have associated attributes, the privacy risks increase. In this paper, we quantitatively study the impact of binary node attributes on node privacy by employing machine-learning-based re-identification attacks and exploring the interplay between graph topology and attribute placement. Our experiments show that the population's diversity on the binary attribute consistently degrades anonymity.
