The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification
Yuqi Zhao, Giovanni Dettori, Matteo Boffa, Luca Vassio, Marco Mellia

TL;DR
This paper critically examines the effectiveness of representation learning models for encrypted traffic classification, revealing that many reported successes are due to data flaws and proposing a new model and evaluation standards.
Contribution
The paper uncovers data preparation issues in how existing models are evaluated, introduces Pcap-Encoder, an LM-based representation learning model designed to extract features from protocol headers, and advocates for improved benchmarking and evaluation practices.
Findings
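Many models rely on shortcuts (spurious correlations between features and labels) that arise from data preparation problems, which inflates their reported performance.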
Pcap-Encoder effectively extracts protocol header features.
Current datasets and training methods are flawed and need revision.
Abstract
Recently we have witnessed the explosion of proposals that, inspired by Language Models like BERT, exploit Representation Learning models to create traffic representations. All of them promise astonishing performance in encrypted traffic classification (up to 98% accuracy). In this paper, with a networking expert mindset, we critically reassess their performance. Through extensive analysis, we demonstrate that the reported successes are heavily influenced by data preparation problems, which allow these models to find easy shortcuts - spurious correlation between features and labels - during fine-tuning that unrealistically boost their performance. When such shortcuts are not present - as in real scenarios - these models perform poorly. We also introduce Pcap-Encoder, an LM-based representation learning model that we specifically design to extract features from protocol headers.…
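To make the notion of a shortcut concrete, here is a minimal sketch of one way a data preparation choice could inflate accuracy. It is an illustration only and not taken from the paper's experiments: the synthetic `flow_id` field, the two split schemes, and the lookup-table "model" are all hypothetical. The model learns nothing about traffic; it only memorizes which flow identifier carried which label.

```python
from collections import Counter

def make_packets(n_flows=20, packets_per_flow=10):
    """Synthetic packets as (flow_id, label) pairs.

    flow_id is a hypothetical stand-in for any flow-specific value;
    the label simply alternates between two classes across flows.
    """
    return [(flow, flow % 2)
            for flow in range(n_flows)
            for _ in range(packets_per_flow)]

def fit_lookup(train):
    """A 'model' that memorizes flow_id -> label, with a majority-class fallback."""
    table = {flow: label for flow, label in train}
    fallback = Counter(label for _, label in train).most_common(1)[0][0]
    return table, fallback

def accuracy(model, test):
    """Fraction of test packets whose label the lookup model gets right."""
    table, fallback = model
    hits = sum(table.get(flow, fallback) == label for flow, label in test)
    return hits / len(test)

packets = make_packets()

# Per-packet split: every 5th packet is held out, so each test packet
# comes from a flow that also appears in the training set.
per_packet_test = packets[::5]
per_packet_train = [p for i, p in enumerate(packets) if i % 5 != 0]

# Per-flow split: the last 4 flows are held out entirely, so the
# memorized identifiers are of no use at test time.
per_flow_train = [p for p in packets if p[0] < 16]
per_flow_test = [p for p in packets if p[0] >= 16]

acc_packet = accuracy(fit_lookup(per_packet_train), per_packet_test)
acc_flow = accuracy(fit_lookup(per_flow_train), per_flow_test)

print(f"per-packet split accuracy: {acc_packet:.2f}")  # 1.00
print(f"per-flow split accuracy:   {acc_flow:.2f}")    # 0.50
```

In this toy setup the identical model goes from perfect accuracy to chance level once the split no longer lets the flow identifier leak into the test set, which mirrors the abstract's point that performance drops when shortcuts are not present.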
