NetML: A Challenge for Network Traffic Analytics
Onur Barut, Yan Luo, Tong Zhang, Weigang Li, Peilong Li

TL;DR
This paper introduces NetML, an open challenge with large, labeled network traffic datasets to facilitate reproducible AI research in traffic classification and malware detection.
Contribution
The paper releases three open datasets and implements several machine learning baselines on them (random forest, SVM, and MLP), addressing the reproducibility and data availability problems in network traffic analysis.
Findings
The three datasets together contain almost 1.3 million labeled flows, with flow features and anonymized raw packets.
Open challenge promotes reproducible AI research in network traffic analysis.
Baseline methods (random forest, SVM, and MLP) are implemented on the datasets for both malware detection and application classification.
Abstract
Classifying network traffic is the basis for important network applications. Prior research in this area has faced challenges on the availability of representative datasets, and many of the results cannot be readily reproduced. Such a problem is exacerbated by emerging data-driven machine learning based approaches. To address this issue, we provide three open datasets containing almost 1.3M labeled flows in total, with flow features and anonymized raw packets, for the research community. We focus on broad aspects in network traffic analysis, including both malware detection and application classification. We release the datasets in the form of an open challenge called NetML and implement several machine learning methods including random-forest, SVM and MLP. As we continue to grow NetML, we expect the datasets to serve as a common platform for AI driven, reproducible research on network…
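The abstract names three baseline model families (random forest, SVM, and MLP) trained on per-flow features. The sketch below shows, under loud assumptions, what such a comparison could look like with scikit-learn. It is not the authors' code: the feature matrix is synthetic (generated with `make_classification`), the three-class labels are placeholders, and the function name `run_baselines` and all hyperparameters are hypothetical choices made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_baselines(seed=0):
    """Train three baseline classifiers on synthetic 'flow features'.

    ASSUMPTION: the data here is synthetic and only stands in for a table
    of per-flow features with one class label per flow. It is NOT NetML data.
    """
    # Synthetic stand-in: 600 "flows", 20 numeric features, 3 classes.
    X, y = make_classification(
        n_samples=600,
        n_features=20,
        n_informative=8,
        n_classes=3,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Hyperparameters are illustrative, not taken from the paper.
    models = {
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
        # SVM and MLP are sensitive to feature scale, so standardize first.
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(64,), max_iter=500, random_state=seed
            ),
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in run_baselines().items():
        print(f"{name}: accuracy = {acc:.3f}")
```

To try the same comparison on real flow data, one would replace the synthetic `X` and `y` with a feature table and labels loaded from the released datasets. The file layout and feature names are not described in this summary, so that loading step is left out here.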
Taxonomy
Topics: Network Security and Intrusion Detection · Internet Traffic Analysis and Secure E-voting · Anomaly Detection Techniques and Applications
Methods: Support Vector Machine
