Quantifying the Privacy Implications of High-Fidelity Synthetic Network Traffic
Van Tran, Shinan Liu, Tian Li, Nick Feamster

TL;DR
This paper introduces privacy metrics for measuring how much sensitive information leaks from synthetic network traffic produced by different generative models. It finds substantial vulnerabilities and identifies the factors that make attacks more likely to succeed.
Contribution
It develops a comprehensive set of privacy metrics for synthetic network traffic and systematically evaluates model vulnerabilities, providing guidance for safer traffic generation.
Findings
Membership inference attack success varies from 0% to 88%.
Up to 100% of network identifiers can be recovered.
Model vulnerabilities depend on data diversity and fit.
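As a rough illustration of what an identifier-recovery figure like the one above could measure, the sketch below computes the fraction of distinct training identifiers that reappear verbatim in synthetic output. This is an assumption about the general shape of such a metric, not the paper's definition; the function name `identifier_recovery_rate` and the sample IP addresses are hypothetical.

```python
def identifier_recovery_rate(train_ids, synthetic_ids):
    """Fraction of distinct training identifiers found verbatim in synthetic data.

    Hypothetical helper: the paper's exact metric may differ.
    """
    train = set(train_ids)
    if not train:
        return 0.0
    recovered = train & set(synthetic_ids)
    return len(recovered) / len(train)


# Made-up identifiers for illustration only.
train_ips = ["10.0.0.1", "10.0.0.2", "192.168.1.5", "172.16.0.9"]
synthetic_ips = ["10.0.0.1", "192.168.1.5", "8.8.8.8"]

rate = identifier_recovery_rate(train_ips, synthetic_ips)
print(f"{rate:.0%}")  # 2 of 4 training identifiers reappear -> 50%
```

A rate of 1.0 under this definition would correspond to every training identifier being recoverable from the synthetic traffic.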
Abstract
To address the scarcity and privacy concerns of network traffic data, various generative models have been developed to produce synthetic traffic. However, synthetic traffic is not inherently privacy-preserving, and the extent to which it leaks sensitive information, and how to measure such leakage, remain largely unexplored. This challenge is further compounded by the diversity of model architectures, which shape how traffic is represented and synthesized. We introduce a comprehensive set of privacy metrics for synthetic network traffic, combining standard approaches like membership inference attacks (MIA) and data extraction attacks with network-specific identifiers and attributes. Using these metrics, we systematically evaluate the vulnerability of different representative generative models and examine the factors that influence attack success. Our results reveal substantial…
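To make the membership inference attack (MIA) mentioned in the abstract concrete, here is a minimal sketch of one common generic variant: a distance-to-closest-synthetic-record attack. It guesses that a candidate record was in the training set when it lies unusually close to some synthetic record. This is not necessarily the attack the authors use; the feature vectors, the threshold, and all function names are hypothetical.

```python
import math


def min_distance(record, synthetic):
    """Euclidean distance from a record to its nearest synthetic record."""
    return min(math.dist(record, s) for s in synthetic)


def membership_guess(candidates, synthetic, threshold):
    """Guess 'member' when a candidate lies within `threshold` of synthetic data."""
    return [min_distance(c, synthetic) <= threshold for c in candidates]


def attack_accuracy(guesses, is_member):
    """Share of candidates whose membership was guessed correctly."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Toy two-dimensional flow features (made up for illustration).
synthetic = [(1.0, 1.0), (5.0, 5.0)]
candidates = [(1.1, 0.9), (5.0, 5.2), (9.0, 0.0), (3.0, 3.0)]
is_member = [True, True, False, False]

guesses = membership_guess(candidates, synthetic, threshold=0.5)
accuracy = attack_accuracy(guesses, is_member)
```

In practice the threshold would be calibrated on held-out data, and success is often reported as a true-positive rate at a fixed false-positive rate rather than raw accuracy.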
Taxonomy
Topics: Network Security and Intrusion Detection · Internet Traffic Analysis and Secure E-voting · Software-Defined Networks and 5G
