A Comprehensive Study of Supervised Machine Learning Models for Zero-Day Attack Detection: Analyzing Performance on Imbalanced Data
Zahra Lotfi, Mostafa Lotfi

TL;DR
This study evaluates supervised machine learning models for zero-day attack detection on imbalanced data, proposing a framework with techniques like grid search and oversampling, and finds XGBoost offers the best balance of speed and accuracy.
Contribution
It introduces a comprehensive framework combining hyperparameter tuning and data balancing to improve zero-day attack detection with supervised models.
Findings
XGBoost achieves high accuracy and speed in zero-day attack detection.
Random Forest achieves the highest accuracy but takes longer to run.
Oversampling improves model accuracy on imbalanced datasets.
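The summary does not say which oversampling method the authors use. As a minimal sketch of the general idea, assuming plain random oversampling (duplicating minority-class samples until the classes are balanced), the step could look like the following. The function name `random_oversample`, the toy feature vectors, and the `"normal"`/`"attack"` labels are all illustrative assumptions, not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class.

    Hypothetical helper for illustration; not the paper's method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples = list(samples)
    out_labels = list(labels)
    for cls, n in counts.items():
        # All original samples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw (with replacement) as many extra copies as the class is short.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy imbalanced data: 6 "normal" flows and 2 "attack" flows (made-up features).
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.2], [0.1, 0.1], [0.3, 0.2],
     [0.9, 0.8], [0.8, 0.9]]
y = ["normal"] * 6 + ["attack"] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In practice, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.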
Abstract
Among the various types of cyberattacks, zero-day attacks are especially hard to identify: they are unknown to security systems because their patterns and characteristics do not match known, blacklisted attacks. Many Machine Learning (ML) models, especially supervised ones, have been designed to analyze and detect network attacks. However, these models classify samples (normal and attack) based on the patterns they learn during the training phase, so they perform poorly on unseen attacks. This research addresses this issue by evaluating five different supervised models, assessing their performance and execution time in predicting zero-day attacks to determine which model is both accurate and fast. The goal is to improve the performance of these supervised models by not only proposing a framework that applies grid search, dimensionality reduction and…
Taxonomy
Topics: Network Security and Intrusion Detection · Imbalanced Data Classification Techniques · Cybercrime and Law Enforcement Studies
