Generating Labeled Flow Data from MAWILab Traces for Network Intrusion Detection
Jinoh Kim, Caitlin Sim, Jinhwan Choi

TL;DR
This paper presents a method to generate labeled network flow datasets by combining packet meta-information with IDS logs, facilitating machine learning-based intrusion detection research.
Contribution
It introduces a novel approach to create labeled flow data from existing network traces, addressing the scarcity of publicly available datasets for intrusion detection.
Findings
Enables access to labeled network flow datasets for research
Uses NetFlow-compatible format for broad device compatibility
Facilitates machine learning applications in intrusion detection
Abstract
A growing issue in modern cyberspace is the direct identification of malicious activity over network connections. The boom of the machine learning industry in the past few years has led to increasing use of machine learning technologies, which are especially prevalent in the network intrusion detection research community. In applying these fairly recent techniques, the community has realized that datasets are pivotal for identifying malicious packets and connections, particularly datasets that carry the label information needed to construct learning models. However, there is a shortage of relevant, publicly available datasets for researchers in the network intrusion detection community. Thus, in this paper, we introduce a method to construct labeled flow data by combining the packet meta-information with IDS logs to infer labels for intrusion…
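The general idea in the abstract, aggregating packet meta-information into flows and then attaching labels taken from IDS log entries, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Packet` and `Alert` record layouts, the 5-tuple flow key, the wildcard matching rule, and the `"unmatched"` default label are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple

FlowKey = Tuple[str, str, int, int, int]  # src_ip, dst_ip, src_port, dst_port, proto


class Packet(NamedTuple):
    """Hypothetical packet meta-information record (headers only, no payload)."""
    ts: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int
    length: int


class Alert(NamedTuple):
    """Hypothetical IDS log entry; a field set to None acts as a wildcard."""
    src_ip: Optional[str]
    dst_ip: Optional[str]
    src_port: Optional[int]
    dst_port: Optional[int]
    label: str


def build_flows(packets: Iterable[Packet]) -> Dict[FlowKey, dict]:
    """Aggregate packets into unidirectional flows keyed by the 5-tuple."""
    flows: Dict[FlowKey, dict] = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flow = flows.get(key)
        if flow is None:
            flows[key] = {"first": p.ts, "last": p.ts,
                          "packets": 1, "bytes": p.length}
        else:
            flow["first"] = min(flow["first"], p.ts)
            flow["last"] = max(flow["last"], p.ts)
            flow["packets"] += 1
            flow["bytes"] += p.length
    return flows


def matches(alert: Alert, key: FlowKey) -> bool:
    """True if every non-wildcard alert field equals the flow's value."""
    src_ip, dst_ip, src_port, dst_port, _proto = key
    pairs = ((alert.src_ip, src_ip), (alert.dst_ip, dst_ip),
             (alert.src_port, src_port), (alert.dst_port, dst_port))
    return all(want is None or want == got for want, got in pairs)


def label_flows(flows: Dict[FlowKey, dict], alerts: List[Alert],
                default: str = "unmatched") -> Dict[FlowKey, dict]:
    """Attach the label of the first matching alert to each flow."""
    labeled = {}
    for key, stats in flows.items():
        label = next((a.label for a in alerts if matches(a, key)), default)
        labeled[key] = {**stats, "label": label}
    return labeled
```

Under these assumptions, `label_flows(build_flows(packets), alerts)` returns one record per flow with packet and byte counts, first and last timestamps, and an inferred label. A real pipeline would also need to handle time windows, bidirectional flows, and conflicts between overlapping alerts, none of which this sketch attempts.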
Taxonomy
Topics: Network Security and Intrusion Detection · Internet Traffic Analysis and Secure E-voting · Network Packet Processing and Optimization
