XRay: Enhancing the Web's Transparency with Differential Correlation
Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu

TL;DR
XRay increases Web transparency by predicting which user data (such as emails, searches, or viewed products) a service uses to target which outputs (such as ads, recommendations, or prices). It makes these predictions with differential correlation, and it works within and across services.
Contribution
The paper introduces XRay, described as the first scalable, service-agnostic system that predicts data-targeting relationships on the Web using differential correlation.
Findings
Achieves high precision and recall in identifying data usage.
Effective across multiple popular web services.
Requires only a small number of additional accounts for accurate predictions.
Abstract
Today's Web services - such as Google, Amazon, and Facebook - leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they choose. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing…
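As a rough illustration of the differential-correlation idea named in the TL;DR, the Python sketch below assumes a handful of hypothetical shadow accounts, each holding a different subset of a user's inputs. It scores each input by how much more often a given output appears in accounts that contain the input than in accounts that do not. The account layout, the names (`differential_scores`, `email_travel`, `ad_flights`), and the scoring rule are all assumptions made for illustration, not XRay's actual algorithm.

```python
from typing import Dict, FrozenSet, Set


def differential_scores(
    account_inputs: Dict[str, FrozenSet[str]],
    account_outputs: Dict[str, Set[str]],
    output: str,
) -> Dict[str, float]:
    """Score each input as P(output | input present) - P(output | input absent).

    Hypothetical scoring rule for illustration only.
    """
    all_inputs = set().union(*account_inputs.values())
    scores: Dict[str, float] = {}
    for item in all_inputs:
        with_item = [a for a, ins in account_inputs.items() if item in ins]
        without_item = [a for a, ins in account_inputs.items() if item not in ins]
        if not with_item or not without_item:
            continue  # no contrast between accounts, so nothing to compare
        p_with = sum(output in account_outputs[a] for a in with_item) / len(with_item)
        p_without = sum(output in account_outputs[a] for a in without_item) / len(without_item)
        scores[item] = p_with - p_without
    return scores


# Hypothetical shadow accounts, each with a different subset of inputs.
accounts = {
    "shadow1": frozenset({"email_travel", "email_loans"}),
    "shadow2": frozenset({"email_travel", "email_pets"}),
    "shadow3": frozenset({"email_loans", "email_pets"}),
    "shadow4": frozenset({"email_pets"}),
}
# Outputs (for example, ads) observed in each shadow account.
observed = {
    "shadow1": {"ad_flights", "ad_credit"},
    "shadow2": {"ad_flights"},
    "shadow3": {"ad_credit"},
    "shadow4": set(),
}

scores = differential_scores(accounts, observed, "ad_flights")
likely_cause = max(scores, key=scores.get)
print(likely_cause, scores[likely_cause])
```

In this toy data the flight ad appears in exactly the accounts that contain the travel email, so that input receives the highest score. A real audit would also need to handle noise, outputs targeted on combinations of inputs, and the choice of how many accounts to create.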
Taxonomy
Topics: Peer-to-Peer Network Technologies · Internet Traffic Analysis and Secure E-voting · Advanced Malware Detection Techniques
