Testing the performance of Multi-class IDS public dataset using Supervised Machine Learning Algorithms
Vusumuzi Malele, Topside E Mathonsi

TL;DR
This paper evaluates the performance of various supervised machine learning algorithms on a public multi-class intrusion detection dataset, finding Random Forest to be the most effective overall.
Contribution
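The paper compares four supervised algorithms (Random Forest, XGBoost, K-Nearest Neighbours, and Artificial Neural Network) on a public multi-class intrusion detection dataset, highlighting the strength of Random Forest in prediction accuracy and of K-Nearest Neighbours in training time.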
Findings
Random Forest achieved the highest prediction accuracy.
K-Nearest Neighbours was fastest in training time.
XGBoost and ANN showed competitive performance.
Abstract
Machine learning, statistical, and knowledge-based methods are often used to implement an anomaly-based Intrusion Detection System, software that helps detect malicious and undesired activity in a network, primarily through the Internet. Machine learning comprises supervised, semi-supervised, and unsupervised learning algorithms. Supervised machine learning trains on a labelled dataset. This paper uses four supervised learning algorithms (Random Forest, XGBoost, K-Nearest Neighbours, and Artificial Neural Network) to test performance on the public dataset. Judged by prediction accuracy, the results show that Random Forest performs best on the multi-class Intrusion Detection System task, followed by XGBoost and K-Nearest Neighbours respectively. Otherwise, K-Nearest Neighbours was the best performer considering…
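The comparison described in the abstract rests on two measurements per model: prediction accuracy on held-out data and the time spent in training. The sketch below illustrates that measurement loop only; it is not the authors' code. The synthetic data, the class labels (`benign`, `dos`, `probe`), and the helper names `TinyKNN`, `MajorityClass`, `evaluate`, and `make_toy_data` are all assumptions introduced for illustration. In practice, Random Forest, XGBoost, and neural-network classifiers from libraries such as scikit-learn or xgboost expose the same `fit`/`predict` interface and could be passed to `evaluate` in the same way.

```python
import random
import time
from collections import Counter


class TinyKNN:
    """Minimal k-nearest-neighbours classifier (illustrative stand-in only)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # KNN is a lazy learner: "training" only stores the labelled samples.
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for query in X:
            # Squared Euclidean distance from the query to every stored sample.
            dists = [
                (sum((a - b) ** 2 for a, b in zip(query, point)), label)
                for point, label in zip(self.X, self.y)
            ]
            nearest = sorted(dists, key=lambda d: d[0])[: self.k]
            votes = Counter(label for _, label in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


class MajorityClass:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


def evaluate(models, X_train, y_train, X_test, y_test):
    """Return {name: (accuracy, training_seconds)} for each model."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        train_seconds = time.perf_counter() - start
        preds = model.predict(X_test)
        accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
        results[name] = (accuracy, train_seconds)
    return results


def make_toy_data(n_per_class=60, seed=0):
    """Three 2-D Gaussian blobs standing in for traffic classes (hypothetical)."""
    rng = random.Random(seed)
    centres = {"benign": (0.0, 0.0), "dos": (5.0, 5.0), "probe": (0.0, 5.0)}
    data = [
        ([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], label)
        for label, (cx, cy) in centres.items()
        for _ in range(n_per_class)
    ]
    rng.shuffle(data)
    return [x for x, _ in data], [label for _, label in data]


X, y = make_toy_data()
split = int(0.8 * len(X))  # 80/20 train/test split, an assumption
results = evaluate(
    {"knn": TinyKNN(k=3), "majority": MajorityClass()},
    X[:split], y[:split], X[split:], y[split:],
)
for name, (acc, secs) in results.items():
    print(f"{name}: accuracy={acc:.3f}, train_time={secs:.6f}s")
```

Because a K-Nearest Neighbours model defers all distance computation to prediction time, its `fit` step is close to free, which is one plausible reason a training-time ranking can differ from an accuracy ranking.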
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques
