PersonaFingerprint: Measuring Persona Inference on Modern Websites with LLM-Driven Browsing
Chuxu Song, Hao Wang, Richard Martin

TL;DR
This paper shows that encrypted traffic metadata (packet lengths and inter-arrival times) can reveal a user's persona on modern websites, and it uses an LLM-driven browsing framework to collect traces and quantify that risk.
Contribution
It introduces an LLM-driven multi-agent browsing framework that enforces controllable persona constraints, and it formalizes persona fingerprinting under closed-set and open-world settings, exposing a privacy risk beyond classic website fingerprinting.
Findings
Persona inference accuracy reaches 84% on mixed-site traffic.
A lightweight multi-task model boosts persona accuracy to 80%.
Encrypted traffic metadata leaks both site and user persona information.
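One way to picture a model that predicts both site and persona from the same trace is a joint training objective over two classification heads. The sketch below is an illustration only: the listing does not describe the paper's architecture or loss, so the function names and the `persona_weight` parameter are hypothetical.

```python
import math
from typing import List


def cross_entropy(logits: List[float], target: int) -> float:
    """Softmax cross-entropy for one example, using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def multitask_loss(
    site_logits: List[float],
    site_label: int,
    persona_logits: List[float],
    persona_label: int,
    persona_weight: float = 1.0,  # hypothetical trade-off weight, not from the paper
) -> float:
    """Sum of the site loss and a weighted persona loss (assumed form)."""
    site_term = cross_entropy(site_logits, site_label)
    persona_term = cross_entropy(persona_logits, persona_label)
    return site_term + persona_weight * persona_term
```

In this form, both heads would share one encoder over the traffic trace, and `persona_weight` controls how strongly the persona task shapes the shared representation.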
Abstract
Website Fingerprinting (WFP) has traditionally focused on inferring which website a user visits from encrypted traffic metadata such as packet sizes and timing. In this paper, we identify and quantify a new privacy risk in modern web settings: an adversary can infer a user's persona using only packet-length and inter-arrival-time sequences. To study this risk at scale, we build an LLM-driven multi-agent browsing framework that enforces controllable persona constraints while a computer-use agent interacts with real websites and collects corresponding encrypted traffic traces. We formalize persona fingerprinting under both closed-set and open-world settings and further evaluate whether persona information is already embedded in representations learned by existing WFP models and can be amplified at low cost. Across 10 modern websites and 15 personas (plus an open-world class), persona…
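To make the adversary's input concrete, here is a minimal sketch of turning a captured trace into the two sequences the abstract names: packet lengths and inter-arrival times. The input format (timestamp in seconds plus a signed length, negative for incoming packets), the fixed `max_len`, and the zero padding are assumptions borrowed from common WFP practice, not details confirmed by this listing.

```python
from typing import List, Tuple

# Hypothetical input: one (timestamp_seconds, signed_length) pair per packet.
# Sign convention (assumed): positive = outgoing, negative = incoming.
Packet = Tuple[float, int]


def trace_to_sequences(
    packets: List[Packet], max_len: int = 8
) -> Tuple[List[int], List[float]]:
    """Return fixed-length packet-length and inter-arrival-time sequences."""
    # Order by capture time and keep only the first max_len packets.
    ordered = sorted(packets, key=lambda p: p[0])[:max_len]
    lengths = [size for _, size in ordered]
    times = [t for t, _ in ordered]
    # The first packet has no predecessor, so its inter-arrival time is 0.
    iats = [0.0 if i == 0 else times[i] - times[i - 1] for i in range(len(times))]
    # Zero-pad short traces so every trace has the same length.
    pad = max_len - len(ordered)
    return lengths + [0] * pad, iats + [0.0] * pad


lengths, iats = trace_to_sequences(
    [(0.5, -1500), (0.0, 300), (1.0, -1500)], max_len=4
)
print(lengths)  # [300, -1500, -1500, 0]
print(iats)     # [0.0, 0.5, 0.5, 0.0]
```

Note that no payload is inspected at any point; the two sequences are all a classifier would see.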
