NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics
Jiawei Zhou, Woojeong Kim, Zhiying Xu, Alexander M. Rush, Minlan Yu

TL;DR
NetFlowGen introduces a pre-training framework for network traffic modeling using unlabeled NetFlow data, enabling efficient adaptation to various tasks like attack detection with limited labeled data.
Contribution
The paper presents a novel large-scale self-supervised pre-training approach for network traffic data, improving model generalization and reducing reliance on task-specific labeled datasets.
Findings
Effective traffic dynamics modeling demonstrated
Enhanced performance on downstream tasks like DDoS detection
Pre-training reduces labeled data requirements
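As a rough illustration of why unlabeled traffic can act as its own supervision, the sketch below builds next-step prediction pairs from a single flow's byte counts. It is not the paper's method: the function name `make_next_step_pairs`, the window size, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple


def make_next_step_pairs(
    series: List[float], context: int
) -> List[Tuple[List[float], float]]:
    """Slide a window over an unlabeled series.

    Each window of `context` values is paired with the value that
    follows it, so the data provides its own prediction target.
    """
    if context < 1:
        raise ValueError("context must be at least 1")
    pairs = []
    for start in range(len(series) - context):
        window = series[start:start + context]
        target = series[start + context]
        pairs.append((window, target))
    return pairs


# Hypothetical bytes-per-interval counts for one flow (made-up numbers).
bytes_per_interval = [120.0, 80.0, 200.0, 150.0, 90.0]
pairs = make_next_step_pairs(bytes_per_interval, context=3)
for window, target in pairs:
    print(window, "->", target)
```

No human labels are involved in building these pairs, which is the general reason a model pre-trained this way might need fewer labeled examples for a downstream task such as DDoS detection.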
Abstract
Understanding the traffic dynamics in networks is a core capability for automated systems to monitor and analyze networking behaviors, reducing expensive human efforts and economic risks through tasks such as traffic classification, congestion prediction, and attack detection. However, it is still challenging to accurately model network traffic with machine learning approaches in an efficient and broadly applicable manner. Task-specific models trained from scratch are used for different networking applications, which limits the efficiency of model development and generalization of model deployment. Furthermore, while networking data is abundant, high-quality task-specific labels are often insufficient for training individual models. Large-scale self-supervised learning on unlabeled data provides a natural pathway for tackling these challenges. We propose to pre-train a general-purpose…
Taxonomy
Topics: Simulation Techniques and Applications
