Profiling user activities with minimal traffic traces
Tiep Mai, Deepak Ajwani, Alessandra Sala

TL;DR
This paper introduces a privacy-preserving method for profiling user activities using truncated web traces, achieving high accuracy by analyzing URL bursts and inter-arrival times, thus balancing user privacy with personalized service needs.
Contribution
It presents a novel statistical approach leveraging URL bursts and inter-arrival times to accurately profile user activities from truncated web traffic data.
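To make the notion of a URL burst concrete, here is a minimal sketch that groups requests into bursts by their inter-arrival times. It is not the paper's statistical method: the fixed threshold `gap_seconds`, the function name `split_into_bursts`, and the `(timestamp, domain)` event layout are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, truncated domain) - assumed layout


def split_into_bursts(events: List[Event], gap_seconds: float = 1.0) -> List[List[Event]]:
    """Group events into bursts: a new burst starts whenever the gap to the
    previous request exceeds gap_seconds (a hypothetical fixed threshold)."""
    bursts: List[List[Event]] = []
    for ts, domain in sorted(events):
        # Compare with the timestamp of the last event in the current burst.
        if bursts and ts - bursts[-1][-1][0] <= gap_seconds:
            bursts[-1].append((ts, domain))
        else:
            bursts.append([(ts, domain)])
    return bursts


# Made-up trace: two clusters of requests separated by a long pause.
trace = [
    (0.0, "news.example"),
    (0.2, "ads.example"),
    (0.5, "cdn.example"),
    (10.0, "mail.example"),
    (10.3, "analytics.example"),
]
bursts = split_into_bursts(trace)
```

With this made-up trace the requests fall into two bursts, one per cluster. A real trace would likely need a threshold estimated from the data rather than a constant.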
Findings
Achieves around 90% accuracy in activity profiling
Effectively distinguishes user activity URLs from noise
Demonstrates scalability on a large mobile web trace dataset
Abstract
Understanding user behavior is essential to personalize and enrich a user's online experience. While there are significant benefits to be accrued from the pursuit of personalized services based on a fine-grained behavioral analysis, care must be taken to address user privacy concerns. In this paper, we consider the use of web traces with truncated URLs - each URL is trimmed to only contain the web domain - for this purpose. While such truncation removes the fine-grained sensitive information, it also strips the data of many features that are crucial to the profiling of user activity. We show how to overcome the severe handicap of lack of crucial features for the purpose of filtering out the URLs representing a user activity from the noisy network traffic trace (including advertisement, spam, analytics, webscripts) with high accuracy. This activity profiling with truncated URLs enables…
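The truncation step described in the abstract can be sketched as follows. This is an illustration only: the helper name `truncate_to_domain` is hypothetical, and treating "web domain" as the URL's hostname (rather than, say, the registered domain) is an assumption not confirmed by the abstract.

```python
from urllib.parse import urlsplit


def truncate_to_domain(url: str) -> str:
    """Drop the path, query string, and fragment, keeping only the hostname."""
    # urlsplit only recognizes the host when a scheme or a leading "//" is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    return parts.hostname or ""


full_url = "https://shop.example.com/cart?item=42&user=alice"
domain = truncate_to_domain(full_url)
```

Here the path and query string, which may carry sensitive details such as item or user identifiers, are discarded and only the host `shop.example.com` remains.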
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Privacy, Security, and Data Protection · Spam and Phishing Detection
