On the Reliability of Profile Matching Across Large Online Social Networks
Oana Goga, Patrick Loiseau, Robin Sommer, Renata Teixeira, Krishna P. Gummadi

TL;DR
This paper evaluates how reliably a user's profiles can be matched across large social networks using only public attributes. It finds that accuracy in practice is lower than prior studies suggested, because those studies evaluated on sampled datasets rather than on full networks with hundreds of millions of candidate profiles.
Contribution
It introduces the ACID properties to assess profile attributes and proposes a new evaluation methodology for realistic profile matching scenarios.
Findings
Matching accuracy is lower in practice than prior estimates suggested.
Many profiles with similar attributes cause false matches.
The paper highlights the limits of current profile matching techniques.
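The effect of scale on false matches can be illustrated with simple base-rate arithmetic. The sketch below is not the paper's methodology: it assumes a hypothetical matcher with a fixed per-pair true-positive rate and false-positive rate, exactly one true matching profile among the candidates, and approximates precision as a ratio of expected counts. The function name and the rates used are illustrative assumptions, not values from the paper.

```python
def expected_precision(tpr: float, fpr: float, candidates: int) -> float:
    """Approximate precision when exactly one of `candidates` profiles is the true match.

    tpr: assumed probability that the matcher accepts the true profile.
    fpr: assumed probability that the matcher accepts any one wrong profile.
    """
    expected_true = tpr                       # one true match in the pool
    expected_false = fpr * (candidates - 1)   # every other profile is a potential false match
    total = expected_true + expected_false
    return expected_true / total if total > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical rates chosen only to show the trend as the pool grows.
    for n in (1_000, 1_000_000, 100_000_000):
        p = expected_precision(tpr=0.9, fpr=1e-6, candidates=n)
        print(f"{n:>11,} candidates -> approx. precision {p:.4f}")
```

Under these assumed rates, a matcher that looks nearly perfect on a sample of a thousand profiles returns mostly false matches once the candidate pool reaches hundreds of millions, which is the kind of gap between sampled and real-world evaluation the summary describes.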
Abstract
Matching the profiles of a user across multiple online social networks brings opportunities for new services and applications as well as new insights on user online behavior, yet it raises serious privacy concerns. Prior literature has proposed methods to match profiles and showed that it is possible to do it accurately, but using evaluations that focused on sampled datasets only. In this paper, we study the extent to which we can reliably match profiles in practice, across real-world social networks, by exploiting public attributes, i.e., information users publicly provide about themselves. Today's social networks have hundreds of millions of users, which brings completely new challenges as a reliable matching scheme must identify the correct matching profile out of the millions of possible profiles. We first define a set of properties for profile attributes--Availability, Consistency,…
Taxonomy
Topics: Spam and Phishing Detection · Internet Traffic Analysis and Secure E-voting · Privacy-Preserving Technologies in Data
