Exposed: Shedding Blacklight on Online Privacy
Lucas Shen, Gaurav Sood

TL;DR
This study combines browsing data with domain-level tracking information to quantify online surveillance. It finds near-universal tracking, a dominant role for a few organizations such as Google, and modest demographic differences in exposure.
Contribution
It provides a comprehensive analysis of web tracking prevalence, the dominant role of certain organizations, and demographic differences in surveillance exposure.
Findings
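Over 99% of users encounter at least one ad tracker or third-party cookie over the observation window.
More invasive techniques (session recording, keylogging, canvas fingerprinting) are less common, but over half of users visited a site employing at least one of them within the first 48 hours.
A single organization, usually Google, can track over 50% of the web activity of more than half of users.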
Abstract
To what extent are users surveilled on the web, by what technologies, and by whom? We answer these questions by combining passively observed, anonymized browsing data of a large, representative sample of Americans with domain-level data on tracking from Blacklight. We find that nearly all users (over 99%) encounter at least one ad tracker or third-party cookie over the observation window. More invasive techniques like session recording, keylogging, and canvas fingerprinting are less widespread, but over half of the users visited a site employing at least one of these within the first 48 hours of the start of tracking. Linking trackers to their parent organizations reveals that a single organization, usually Google, can track over 50% of the web activity of more than half the users. Demographic differences in exposure are modest and often attenuate when we account for browsing volume.…
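The core measurement described in the abstract, joining per-user browsing records to domain-level tracking data and asking what share of users hit at least one tracked domain, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy visits, the `.example` domains, the field names (`ad_trackers`, `third_party_cookies`, `keylogging`), and the helper functions are hypothetical and do not reflect the paper's data or Blacklight's actual output schema.

```python
# Hypothetical toy browsing log: one (user_id, domain) pair per visit.
visits = [
    ("u1", "news.example"),
    ("u1", "shop.example"),
    ("u2", "blog.example"),
    ("u2", "news.example"),
    ("u3", "blog.example"),
]

# Hypothetical domain-level tracking flags (field names are assumptions,
# not Blacklight's real schema).
domain_tracking = {
    "news.example": {"ad_trackers": 3, "third_party_cookies": 5, "keylogging": False},
    "shop.example": {"ad_trackers": 0, "third_party_cookies": 0, "keylogging": True},
    "blog.example": {"ad_trackers": 0, "third_party_cookies": 0, "keylogging": False},
}


def has_basic_tracking(info):
    """True if the domain has at least one ad tracker or third-party cookie."""
    return info["ad_trackers"] > 0 or info["third_party_cookies"] > 0


def has_keylogging(info):
    """True if the domain is flagged for keylogging."""
    return info["keylogging"]


def exposed_users(visits, domain_tracking, is_tracked):
    """Users who visited at least one domain for which is_tracked(info) holds."""
    exposed = set()
    for user, domain in visits:
        info = domain_tracking.get(domain)  # domains without scan data are skipped
        if info is not None and is_tracked(info):
            exposed.add(user)
    return exposed


def exposure_share(visits, domain_tracking, is_tracked):
    """Fraction of all observed users exposed to the given tracking type."""
    users = {user for user, _ in visits}
    if not users:
        return 0.0
    return len(exposed_users(visits, domain_tracking, is_tracked)) / len(users)


basic_share = exposure_share(visits, domain_tracking, has_basic_tracking)
keylog_share = exposure_share(visits, domain_tracking, has_keylogging)
print(f"basic tracking: {basic_share:.2f}, keylogging: {keylog_share:.2f}")
```

Passing the tracking criterion as a function keeps the same join reusable for each technique the abstract names; a real analysis would also need to handle time windows and survey weights, which this sketch omits.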
Taxonomy
Topics: Privacy, Security, and Data Protection · Spam and Phishing Detection · Mobile Crowdsensing and Crowdsourcing
