Bias in the Shadows: Explore Shortcuts in Encrypted Network Traffic Classification
Chuyi Wang, Xiaohui Xie, Tongze Wang, Yong Cui

TL;DR
This paper introduces BiasSeeker, a model-agnostic, data-driven framework that detects shortcut features in encrypted network traffic datasets, so that spurious features which may compromise generalization can be identified independently of any classifier.
Contribution
BiasSeeker is the first semi-automated, model-agnostic tool for identifying dataset-specific shortcut features in encrypted traffic using statistical correlation analysis.
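To make the idea of classifier-independent statistical correlation analysis concrete, here is a minimal sketch. It is not BiasSeeker's actual procedure. It assumes, purely for illustration, that mutual information between the byte value at each packet offset and the class label is a reasonable stand-in statistic. The function names `mutual_information` and `rank_byte_offsets`, the toy packets, and the labels are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two equal-length sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_byte_offsets(packets, labels, num_offsets):
    """Score each byte offset by its MI with the label, highest score first.

    Assumption: one packet per sample; -1 marks offsets past a packet's end.
    """
    scores = []
    for off in range(num_offsets):
        column = [p[off] if off < len(p) else -1 for p in packets]
        scores.append((off, mutual_information(column, labels)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data (hypothetical): offset 0 leaks the label, offset 1 is constant,
# and offset 2 varies independently of the label.
packets = [
    bytes([0x0A, 0x17, 0x01]),
    bytes([0x0A, 0x17, 0x02]),
    bytes([0x0B, 0x17, 0x01]),
    bytes([0x0B, 0x17, 0x02]),
]
labels = ["app_a", "app_a", "app_b", "app_b"]

ranking = rank_byte_offsets(packets, labels, num_offsets=3)
for offset, score in ranking:
    print(f"offset {offset}: MI = {score:.3f} bits")
```

In this toy example the offset that perfectly separates the two labels ranks first, which is the kind of field an analyst would then inspect by hand to decide whether it reflects the application or merely the capture environment. No model is trained at any point, which is what makes such a check model-agnostic.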
Findings
Effective detection of shortcut features across 19 datasets
Enhances model generalization by reducing bias
Provides a systematic approach for feature validation
Abstract
Pre-trained models operating directly on raw bytes have achieved promising performance in encrypted network traffic classification (NTC), but they often suffer from shortcut learning, that is, relying on spurious correlations that fail to generalize to real-world data. Existing solutions rely heavily on model-specific interpretation techniques, which lack adaptability and generality across different model architectures and deployment scenarios. In this paper, we propose BiasSeeker, the first semi-automated framework that is both model-agnostic and data-driven for detecting dataset-specific shortcut features in encrypted traffic. By performing statistical correlation analysis directly on raw binary traffic, BiasSeeker identifies spurious or environment-entangled features that may compromise generalization, independent of any classifier. To address the diverse nature of shortcut features, we introduce…
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection · Advanced Malware Detection Techniques
