A Robust Comparison of the KDDCup99 and NSL-KDD IoT Network Intrusion Detection Datasets Through Various Machine Learning Algorithms
Suchet Sapre, Pouyan Ahmadi, Khondkar Islam

TL;DR
This study compares the KDDCup99 and NSL-KDD IoT network intrusion datasets using various machine learning algorithms, revealing that NSL-KDD is of higher quality due to less bias and redundancy.
Contribution
It provides a comprehensive evaluation of both datasets with multiple metrics, highlighting the superior quality of NSL-KDD for intrusion detection research.
Findings
NSL-KDD dataset yields more reliable classifier performance.
Classifiers trained on KDDCup99 are biased by redundancies.
Classifiers trained on NSL-KDD are on average 20.18% less accurate than those trained on KDDCup99, which the authors take as evidence of higher data quality.
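The redundancy finding above can be illustrated with a minimal sketch that measures how many records in a dataset are exact repeats and then removes them. The helper names `redundancy_rate` and `deduplicate`, the whole-row notion of a duplicate, and the toy records are all assumptions made for illustration; none of them come from the paper or from either dataset.

```python
from collections import Counter


def redundancy_rate(records):
    """Return the fraction of records that repeat an earlier record exactly."""
    if not records:
        return 0.0
    distinct = len(Counter(records))
    return 1.0 - distinct / len(records)


def deduplicate(records):
    """Keep the first occurrence of each record, preserving the original order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Hypothetical toy records of the form (protocol, service, label).
# These are illustrative only and are not drawn from KDDCup99 or NSL-KDD.
toy = [
    ("tcp", "http", "normal"),
    ("tcp", "http", "normal"),
    ("icmp", "ecr_i", "attack"),
    ("icmp", "ecr_i", "attack"),
    ("icmp", "ecr_i", "attack"),
    ("udp", "domain", "normal"),
]

print(redundancy_rate(toy))   # share of rows that are repeats
print(len(deduplicate(toy)))  # number of distinct rows
```

On the toy data, half of the rows are repeats and three distinct rows remain. A classifier evaluated on data with many repeated rows is scored again and again on records it has effectively already seen, which is one way redundancy can inflate reported accuracy.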
Abstract
In recent years, as intrusion attacks on IoT networks have grown exponentially, there is an immediate need for sophisticated intrusion detection systems (IDSs). A vast majority of current IDSs are data-driven, which means that one of the most important aspects of this area of research is the quality of the data acquired from IoT network traffic. Two of the most cited intrusion detection datasets are the KDDCup99 and the NSL-KDD. The main goal of our project was to conduct a robust comparison of both datasets by evaluating the performance of various Machine Learning (ML) classifiers trained on them with a larger set of classification metrics than previous researchers. From our research, we were able to conclude that the NSL-KDD dataset is of a higher quality than the KDDCup99 dataset as the classifiers trained on it were on average 20.18% less accurate. This is because the classifiers…
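The abstract's point about using a larger set of classification metrics can be sketched as follows. The visible text does not list the metrics the authors used, so the choice of accuracy, precision, recall, and F1 here is an assumption, as are the function name `binary_metrics`, the labels, and the toy predictions.

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Compute accuracy, precision, recall, and F1 for a binary labeling task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: an imbalanced test set and a classifier that
# labels every record as an attack. Not taken from the paper's results.
y_true = ["attack"] * 8 + ["normal"] * 2
y_pred = ["attack"] * 10

for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

In this toy case the always-attack classifier reaches 0.8 accuracy and perfect recall even though it never identifies normal traffic, which is why a single metric can give a misleading picture of classifier quality.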
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
