Listen to Minority: Encrypted Traffic Classification for Class Imbalance with Contrastive Pre-Training
Xiang Li, Juncheng Guo, Qige Song, Jiang Xie, Yafei Sang, Shuyuan Zhao, and Yongzheng Zhang

TL;DR
This paper introduces PASS, a semi-supervised framework with contrastive pre-training for encrypted traffic classification. It handles class imbalance and traffic homogeneity, reduces reliance on labeled data, and outperforms existing methods.
Contribution
The paper proposes a novel contrastive pre-training and semi-supervised strategy that addresses class imbalance, traffic homogeneity, and label scarcity in encrypted traffic classification.
Findings
PASS outperforms state-of-the-art ETC methods on multiple datasets.
Contrastive pre-training improves feature robustness for overlapping traffic.
Pseudo-label iteration enhances utilization of unlabeled data.
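The pseudo-label finding can be illustrated with a generic self-training loop. This is a minimal sketch, not the PASS procedure: the nearest-centroid classifier, the softmax-over-distances confidence, the `threshold` of 0.9, and all function names are assumptions chosen for illustration, since this summary does not describe how PASS selects or iterates pseudo-labels.

```python
import math


def centroids(points, labels):
    """Return the mean vector of each class (hypothetical stand-in model)."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(cents, p):
    """Softmax over negative centroid distances; returns (label, confidence)."""
    classes = sorted(cents)
    scores = [-math.dist(p, cents[c]) for c in classes]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    best = max(range(len(classes)), key=lambda i: exps[i])
    return classes[best], exps[best] / sum(exps)


def pseudo_label_iterations(labeled, labels, unlabeled, threshold=0.9, rounds=5):
    """Repeatedly move confidently predicted unlabeled samples into the labeled set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled, labels)  # "retrain" on the current labeled set
        remaining, added = [], 0
        for p in pool:
            y, conf = predict_with_confidence(cents, p)
            if conf >= threshold:
                labeled.append(p)
                labels.append(y)
                added += 1
            else:
                remaining.append(p)  # too uncertain; keep it unlabeled
        pool = remaining
        if added == 0:  # nothing new was labeled, so further rounds change nothing
            break
    return labeled, labels, pool
```

With two labeled points at (0, 0) and (10, 10), unlabeled points near either one are absorbed in the first round, while a point equidistant from both stays in the pool because its confidence never exceeds the threshold.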
Abstract
The mobile Internet has profoundly reshaped many aspects of modern life. Encrypted Traffic Classification (ETC) plays a crucial role in managing the mobile Internet, especially given the explosive growth of mobile apps that use encrypted communication. Although some existing learning-based ETC methods show promising results, three limitations remain in real-world network environments: 1) label bias caused by traffic class imbalance, 2) traffic homogeneity caused by component sharing, and 3) reliance on sufficient labeled traffic for training. None of the existing ETC methods addresses all of these limitations. In this paper, we propose a novel Pre-trAining Semi-Supervised ETC framework, dubbed PASS. Our key insight is to resample the original training dataset and perform contrastive pre-training without using individual app labels directly, to avoid label bias issues…
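For readers unfamiliar with contrastive pre-training, the sketch below computes an InfoNCE-style loss for one anchor sample. InfoNCE appears in this page's Methods list, but the abstract shown here does not state the exact loss, how positive and negative pairs are built after resampling, or any hyperparameters. The cosine similarity, the `temperature` of 0.1, and the function names are therefore assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # log-sum-exp with the max subtracted for stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]
```

The loss is small when the anchor is much closer to its positive than to any negative, and grows when a negative is more similar than the positive. Note that no class label enters the computation; only the pairing of samples does, which is the property the abstract relies on to avoid label bias.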
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection · Hate Speech and Cyberbullying Detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Residual Connection · Relative Position Encodings · Dense Connections · InfoNCE
