Browsing behavior exposes identities on the Web
Marcos Oliveira, Junran Yang, Daniel Griffiths, Denis Bonnay, Juhi Kulshrestha

TL;DR
This paper demonstrates that individuals can be uniquely identified from their web browsing patterns using just a few visited domains. Because these digital fingerprints are stable and allow re-identification, they raise significant privacy concerns.
Contribution
It reveals that minimal browsing data can reliably re-identify individuals, highlighting a substantial privacy vulnerability on the Web.
Findings
95% of individuals are identified by their four most visited domains
80% re-identification rate across different time periods
Digital fingerprints are stable over time
Abstract
How easy is it to uniquely identify a person based solely on their web browsing behavior? Here we show that when people navigate the Web, their online traces produce fingerprints that identify them. Merely the four most visited web domains are enough to identify 95% of the individuals. These digital fingerprints are stable and render high re-identifiability. We demonstrate that we can re-identify 80% of the individuals in separate time slices of data. Such a privacy threat persists even with limited information about individuals' browsing behavior, reinforcing existing concerns around online privacy.
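The two measurements described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a fingerprint is the unordered set of a user's k most visited domains, that ties in visit counts are broken alphabetically, and that re-identification means a fingerprint from one time slice matches exactly one user in the other slice and that user is the same person. The function names and the input layout (a dict mapping a user ID to a list of visited domains) are hypothetical.

```python
from collections import Counter


def top_k_fingerprint(visits, k=4):
    """Return the k most visited domains as an unordered fingerprint.

    Assumption: the fingerprint ignores rank order among the top k.
    """
    counts = Counter(visits)
    # Sort by descending visit count, then by name so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return frozenset(domain for domain, _ in ranked[:k])


def unique_fraction(histories, k=4):
    """Fraction of users whose top-k fingerprint is shared with nobody else."""
    if not histories:
        return 0.0
    prints = {user: top_k_fingerprint(v, k) for user, v in histories.items()}
    freq = Counter(prints.values())
    unique = sum(1 for fp in prints.values() if freq[fp] == 1)
    return unique / len(prints)


def reidentification_rate(slice_a, slice_b, k=4):
    """Fraction of users in slice_a matched to themselves, and only themselves, in slice_b."""
    if not slice_a:
        return 0.0
    # Index slice_b by fingerprint so each lookup is a dictionary access.
    users_by_print = {}
    for user, visits in slice_b.items():
        users_by_print.setdefault(top_k_fingerprint(visits, k), []).append(user)
    hits = 0
    for user, visits in slice_a.items():
        candidates = users_by_print.get(top_k_fingerprint(visits, k), [])
        if candidates == [user]:
            hits += 1
    return hits / len(slice_a)
```

Calling `unique_fraction(histories, k=4)` on a full data set and `reidentification_rate(slice_a, slice_b, k=4)` on two disjoint time windows would give figures comparable in spirit to the 95% and 80% reported above, though the paper's exact matching procedure may differ from this sketch.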
Taxonomy
Topics: Privacy, Security, and Data Protection · Misinformation and Its Impacts · Spam and Phishing Detection
