NetFlow Datasets for Machine Learning-based Network Intrusion Detection Systems

Mohanad Sarhan; Siamak Layeghy; Nour Moustafa; Marius Portmann

arXiv:2011.09144·cs.NI·May 18, 2021

TL;DR

This paper introduces NetFlow feature datasets derived from four benchmark NIDS datasets to facilitate consistent evaluation of ML-based intrusion detection systems across different data sources.

Contribution

It provides publicly available NetFlow datasets from multiple benchmarks, enabling more reliable and comparable ML model evaluations for network intrusion detection.

Findings

01 NetFlow features yield similar binary classification results across datasets.

02 Multi-class classification performance is lower with NetFlow features than with the original features.

03 NetFlow features are easier to extract, and the resulting datasets are publicly available for research.

Abstract

Machine Learning (ML)-based Network Intrusion Detection Systems (NIDSs) have proven to be a reliable intelligence tool for protecting networks against cyberattacks. Network data features have a great impact on the performance of ML-based NIDSs. However, evaluations of ML models are often unreliable, as each ML-enabled NIDS is trained and validated using different data features that may not contain security events. Therefore, a common ground feature set from multiple datasets is required to evaluate an ML model's detection accuracy and its ability to generalise across datasets. This paper presents NetFlow features extracted from four benchmark NIDS datasets, UNSW-NB15, BoT-IoT, ToN-IoT, and CSE-CIC-IDS2018, using their publicly available packet capture files. In a real-world scenario, NetFlow features are relatively easier to extract from network traffic compared to the complex features…
