Optimized Random Forest Model for Botnet Detection Based on DNS Queries
Abdallah Moubayed, MohammadNoor Injadat, Abdallah Shami

TL;DR
This paper introduces an optimized machine learning framework using feature selection and hyper-parameter tuning to enhance DNS-based botnet detection, achieving high accuracy and robustness on a standard dataset.
Contribution
It combines information gain for feature selection with a genetic algorithm for hyper-parameter tuning to optimize a random forest classifier specifically for DNS-based botnet detection.
Findings
Reduced feature set size by up to 60%.
Achieved high detection accuracy, precision, recall, and F-score.
Demonstrated robustness and effectiveness on the TI-2016 dataset.
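The feature set reduction comes from ranking features by information gain. As a minimal sketch of that idea, and not the authors' implementation, the snippet below scores discrete features by how much they reduce label entropy and keeps the top fraction. The feature names (`long_query`, `uses_tcp`), the toy data, and the `keep_fraction` parameter are all hypothetical, and real DNS features would likely need discretizing first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the label entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, keep_fraction):
    """Rank named feature columns by information gain and keep the top fraction."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]


# Made-up toy data (hypothetical feature names, not taken from TI-2016).
labels = ["bot", "bot", "benign", "benign"]
columns = {
    "long_query": [1, 1, 0, 0],  # separates the two classes perfectly
    "uses_tcp": [0, 1, 0, 1],    # carries no information about the class
}
selected = select_features(columns, labels, keep_fraction=0.5)
```

In this toy case the ranking keeps only `long_query`, because `uses_tcp` is split evenly across both classes and so has zero gain.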
Abstract
The Domain Name System (DNS) protocol plays a major role in today's Internet as it translates between website names and corresponding IP addresses. However, due to the lack of processes for data integrity and origin authentication, the DNS protocol has several security vulnerabilities. This often leads to a variety of cyber-attacks, including botnet network attacks. One promising solution to detect DNS-based botnet attacks is adopting machine learning (ML) based solutions. To that end, this paper proposes a novel optimized ML-based framework to detect botnets based on their corresponding DNS queries. More specifically, the framework consists of using information gain as a feature selection method and genetic algorithm (GA) as a hyper-parameter optimization model to tune the parameters of a random forest (RF) classifier. The proposed framework is evaluated using a state-of-the-art…
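The hyper-parameter tuning step described in the abstract can be illustrated with a small genetic algorithm. This is a sketch under stated assumptions rather than the paper's procedure: the search space, the parameter names (borrowed from common random forest settings), the operators (tournament selection, uniform crossover, random-reset mutation, elitism), and the `toy_fitness` function are all hypothetical. In practice the fitness would be something like the cross-validated accuracy of a random forest trained with the candidate parameters.

```python
import random

# Hypothetical search space; names mirror common random forest settings.
SEARCH_SPACE = {
    "n_estimators": [10, 50, 100, 200],
    "max_depth": [5, 10, 20, 40],
    "min_samples_split": [2, 5, 10],
    "criterion": ["gini", "entropy"],
}


def random_individual(rng):
    """Draw one candidate parameter set uniformly from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each parameter is copied from one parent at random."""
    return {name: rng.choice([a[name], b[name]]) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each parameter from its options with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def tournament(population, fitness, rng, k=3):
    """Return the fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def genetic_search(fitness, generations=20, pop_size=12, seed=0):
    """Evolve parameter sets and return the best one found."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best candidate always survives
        while len(children) < pop_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(params):
    """Stand-in score (an assumption); replace with cross-validated accuracy."""
    score = -abs(params["n_estimators"] - 100) / 100
    score -= abs(params["max_depth"] - 20) / 20
    return score + (0.1 if params["criterion"] == "entropy" else 0.0)


best_params = genetic_search(toy_fitness)
```

Because of elitism, the best score never decreases from one generation to the next, although nothing guarantees that the search reaches the global optimum.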
Taxonomy
Methods: Feature Selection
