Beyond the Crawl: Unmasking Browser Fingerprinting in Real User Interactions
Meenatchi Sundaram Muthu Selva Annamalai, Igor Bilogrevic, Emiliano De Cristofaro

TL;DR
This study finds that automated web crawls miss nearly half (45%) of the fingerprinting websites that real users encounter, so measuring real-world fingerprinting accurately requires data from real browsing sessions.
Contribution
The paper presents a 10-week user study of 30 participants' real browsing behavior, compares the fingerprinting it observes against automated crawl data, and evaluates federated learning for improved fingerprinting detection.
Findings
Automated crawls miss 45% of fingerprinting websites encountered by real users.
Real user data uncovers new fingerprinting vectors absent in automated crawls.
Federated learning improves fingerprinting detection accuracy on real user data.
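The 45% figure can be read as a set difference: of the fingerprinting sites observed in real user sessions, the share that the crawl did not flag. The sketch below illustrates that arithmetic only. The function name, the site names, and the exact definition of the metric are assumptions for illustration, not taken from the paper.

```python
def crawl_miss_rate(user_sites, crawl_sites):
    """Fraction of user-observed fingerprinting sites that the crawl did not flag.

    Assumed definition: |user_sites - crawl_sites| / |user_sites|.
    """
    user_sites = set(user_sites)
    if not user_sites:
        raise ValueError("no user-observed fingerprinting sites")
    missed = user_sites - set(crawl_sites)
    return len(missed) / len(user_sites)


# Hypothetical toy data, not from the paper.
user_observed = {"a.example", "b.example", "c.example", "d.example"}
crawl_observed = {"a.example", "b.example", "e.example"}

rate = crawl_miss_rate(user_observed, crawl_observed)
print(f"{rate:.0%}")  # 2 of the 4 user-observed sites are missed
```

Sites that only the crawl flags (here `e.example`) do not enter this ratio, since the denominator is restricted to what real users encountered.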
Abstract
Browser fingerprinting is a pervasive online tracking technique used increasingly often for profiling and targeted advertising. Prior research on the prevalence of fingerprinting heavily relied on automated web crawls, which inherently struggle to replicate the nuances of human-computer interactions. This raises concerns about the accuracy of current understandings of real-world fingerprinting deployments. As a result, this paper presents a user study involving 30 participants over 10 weeks, capturing telemetry data from real browsing sessions across 3,000 top-ranked websites. Our evaluation reveals that automated crawls miss almost half (45%) of the fingerprinting websites encountered by real users. This discrepancy mainly stems from the crawlers' inability to access authentication-protected pages, circumvent bot detection, and trigger fingerprinting scripts activated by specific…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Privacy, Security, and Data Protection · User Authentication and Security Systems
