Reality Check for Tor Website Fingerprinting in the Open World
Mohammadhamed Shadbeh, Khashayar Khajavi, Tao Wang

TL;DR
This paper re-evaluates the effectiveness of website fingerprinting attacks on Tor in real-world conditions using a large-scale, open-world dataset, demonstrating high attack accuracy despite network variability and traffic splitting.
Contribution
The paper introduces a privacy-preserving methodology for building an open-world dataset from real, unlabeled Tor traffic paired with synthetic monitored traces, benchmarks state-of-the-art attacks in this setting, and analyzes the factors that affect attack robustness.
Findings
WF remains highly effective against real open-world Tor traffic
Timing-independent classifiers are more robust to network variability
Guard nodes with latency advantages can sustain high attack effectiveness
Abstract
Website fingerprinting (WF) attacks on Tor can infer user destinations from encrypted traffic metadata. However, their real-world effectiveness remains debated due to laboratory settings that fail to capture network fluctuations, evaluate noise, and create a representative open world. In this work, we re-examine WF from a guard-relay vantage point using a novel, privacy-preserving methodology that builds an open-world background from real, unlabeled Tor traffic paired with synthetic monitored traces. Using this methodology, we collect a large-scale dataset of over 800,000 traces. We then benchmark state-of-the-art WF attacks under a cross-network setting and show that WF remains highly effective against real Tor open-world traffic: the best-performing attack achieves 0.956 precision and 0.922 recall at a 9% base rate. We further present results that demonstrate robustness to small…
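The reported precision is tied to the 9% base rate, because in an open world the number of false alarms grows with the share of unmonitored traffic. The sketch below illustrates that relationship under a simplifying assumption that is not taken from the paper: the attack is treated as a binary monitored-versus-unmonitored decision, so precision depends only on the true positive rate, the false positive rate, and the base rate. The function names and the 1% base rate in the last step are hypothetical and for illustration only.

```python
def precision_at_base_rate(tpr: float, fpr: float, base_rate: float) -> float:
    """Binary open-world precision: the fraction of 'monitored' alarms that are correct.

    base_rate is the fraction of all traces that belong to monitored sites.
    """
    true_alarms = base_rate * tpr
    false_alarms = (1.0 - base_rate) * fpr
    return true_alarms / (true_alarms + false_alarms)


def implied_fpr(precision: float, tpr: float, base_rate: float) -> float:
    """Invert the formula above to recover the false positive rate."""
    true_alarms = base_rate * tpr
    return true_alarms * (1.0 - precision) / (precision * (1.0 - base_rate))


# Figures quoted in the abstract: precision 0.956, recall 0.922, 9% base rate.
# Under the binary simplification, recall plays the role of the TPR.
fpr = implied_fpr(precision=0.956, tpr=0.922, base_rate=0.09)

# Hypothetical scenario (not from the paper): same TPR and FPR, 1% base rate.
precision_low_base = precision_at_base_rate(tpr=0.922, fpr=fpr, base_rate=0.01)

print(f"implied FPR: {fpr:.4%}")
print(f"precision at 1% base rate: {precision_low_base:.3f}")
```

Holding the classifier's error rates fixed, precision falls as the base rate shrinks, which is why an open-world result is only meaningful alongside the base rate at which it was measured.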
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Advanced Steganography and Watermarking Techniques · Wireless Signal Modulation Classification
