Robust identification of email tracking: A machine learning approach
Johannes Haupt, Benedict Bender, Benjamin Fabian, Stefan Lessmann

TL;DR
This paper presents a machine learning-based detection engine to identify email tracking images, aiming to enhance user privacy by blocking covert tracking techniques in email communications.
Contribution
It introduces a set of novel, efficient features for tracking image detection, and evaluates their effectiveness across diverse real-world scenarios using multiple classifiers.
Findings
High detection accuracy on out-of-sample data
Features are resilient to changes in tracking infrastructure
Effective detection across different countries and industries
Abstract
Email tracking allows email senders to collect fine-grained behavior and location data on email recipients, who are uniquely identifiable via their email address. Such tracking invades user privacy in that email tracking techniques gather data without user consent or awareness. Striving to increase privacy in email communication, this paper develops a detection engine to be the core of a selective tracking blocking mechanism in the form of three contributions. First, a large collection of email newsletters is analyzed to show the wide usage of tracking over different countries, industries and time. Second, we propose a set of features geared towards the identification of tracking images under real-world conditions. Novel features are devised to be computationally feasible and efficient, generalizable and resilient towards changes in tracking infrastructure. Third, we test the predictive…
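The abstract does not list the proposed features, so the following is only a minimal sketch of what per-image feature extraction for such a detection engine might look like. The feature names (`tiny_declared_size`, `url_length`, `has_query`, `path_depth`) and the helpers `ImgCollector`, `image_features`, and `extract_features` are illustrative assumptions, not the authors' feature set or implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgCollector(HTMLParser):
    """Collect the attributes of every <img> tag in an HTML email body."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "img":
            self.images.append(dict(attrs))


def to_int(value):
    """Parse a declared width/height such as "1" or "1px"; None if unusable."""
    if value is None:
        return None
    text = value.strip().lower()
    if text.endswith("px"):
        text = text[:-2]
    return int(text) if text.isdigit() else None


def image_features(attrs):
    """Hypothetical features for one image; NOT the paper's feature set."""
    src = attrs.get("src") or ""
    url = urlparse(src)
    width = to_int(attrs.get("width"))
    height = to_int(attrs.get("height"))
    return {
        "declared_width": width,
        "declared_height": height,
        # Assumption: a declared size of at most 1x1 hints at a tracking pixel.
        "tiny_declared_size": (
            width is not None and height is not None
            and width <= 1 and height <= 1
        ),
        "url_length": len(src),
        # Assumption: recipient identifiers are often passed as query parameters.
        "has_query": bool(url.query),
        "path_depth": len([part for part in url.path.split("/") if part]),
    }


def extract_features(html):
    """Return one feature dictionary per <img> tag found in the HTML."""
    parser = ImgCollector()
    parser.feed(html)
    return [image_features(attrs) for attrs in parser.images]
```

Feature rows of this kind could then be passed to any standard classifier. Which features and classifiers actually perform well is what the paper evaluates, and this sketch makes no claim about that.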
