Optimized Random Forest Model for Botnet Detection Based on DNS Queries

Abdallah Moubayed; MohammadNoor Injadat; Abdallah Shami

arXiv:2012.11326·cs.CR·December 22, 2020

TL;DR

This paper introduces an optimized machine learning framework that combines feature selection with hyper-parameter tuning to improve DNS-based botnet detection, and reports high accuracy and robustness on the TI-2016 dataset.

Contribution

The paper presents a novel combination of information gain (for feature selection) and a genetic algorithm (for hyper-parameter tuning) to optimize a random forest classifier for DNS-based botnet detection.
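As a rough illustration of the information gain step, the sketch below ranks discrete features by IG(Y; X) = H(Y) - H(Y | X) and keeps the top few. It is a minimal standard-library sketch, not the authors' implementation: the feature names, toy values, and the `select_features` helper are all hypothetical, and continuous DNS features would need to be discretized before this calculation applies.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a single discrete feature X."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # H(Y | X): entropy of the labels within each feature value, weighted by frequency.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, keep):
    """Return the names of the `keep` features with the highest information gain."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    return ranked[:keep]


# Hypothetical toy data: names and values are illustrative, not taken from the paper.
labels = ["bot", "bot", "benign", "benign"]
columns = {
    "long_domain": [1, 1, 0, 0],  # separates the two classes perfectly
    "has_digit": [1, 0, 1, 0],    # independent of the class
    "rare_tld": [1, 1, 1, 0],     # partially informative
}
selected = select_features(columns, labels, keep=2)
print(selected)  # ['long_domain', 'rare_tld']
```

Features whose gain is close to zero, such as `has_digit` here, are the ones a ranking like this would drop first.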

Findings

01. Reduced the feature set size by up to 60%.

02. Achieved high detection accuracy, precision, recall, and F-score.

03. Demonstrated robustness and effectiveness on the TI-2016 dataset.

Abstract

The Domain Name System (DNS) protocol plays a major role in today's Internet as it translates between website names and corresponding IP addresses. However, due to the lack of processes for data integrity and origin authentication, the DNS protocol has several security vulnerabilities. This often leads to a variety of cyber-attacks, including botnet network attacks. One promising way to detect DNS-based botnet attacks is to adopt machine learning (ML) based solutions. To that end, this paper proposes a novel optimized ML-based framework to detect botnets based on their corresponding DNS queries. More specifically, the framework uses information gain as a feature selection method and a genetic algorithm (GA) as a hyper-parameter optimization method to tune the parameters of a random forest (RF) classifier. The proposed framework is evaluated using a state-of-the-art…
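The GA-based hyper-parameter search described above could be sketched roughly as follows. This is a minimal standard-library sketch under stated assumptions, not the authors' configuration: the search space, population size, mutation rate, and tournament selection are all illustrative choices, and `surrogate_fitness` is a hypothetical stand-in for what would in practice be a cross-validated score of a random forest trained with the candidate parameters.

```python
import random

# Hypothetical search space; the ranges used in the paper are not given here.
SEARCH_SPACE = {
    "n_estimators": list(range(10, 201, 10)),
    "max_depth": list(range(2, 31)),
    "min_samples_split": list(range(2, 11)),
}


def random_individual(rng):
    """Draw one candidate hyper-parameter setting uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene from its allowed values with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def tournament(population, scores, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def genetic_search(fitness, generations=30, pop_size=20, seed=0):
    """Maximize `fitness` over SEARCH_SPACE; returns (best_params, best_score)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for generation in range(generations):
        scores = [fitness(individual) for individual in population]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_score:
            best, best_score = population[top], scores[top]
        if generation == generations - 1:
            break  # no need to breed after the final evaluation
        children = [best]  # elitism: the best setting found so far survives
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    return best, best_score


def surrogate_fitness(params):
    """Hypothetical stand-in for a cross-validated score; highest at (100, 12, 4)."""
    return -(
        abs(params["n_estimators"] - 100) / 10
        + abs(params["max_depth"] - 12)
        + abs(params["min_samples_split"] - 4)
    )


best_params, best_score = genetic_search(surrogate_fitness)
print(best_params, best_score)
```

To tune a real classifier, `surrogate_fitness` would be replaced by a function that trains a random forest on the selected features with the given parameters and returns its validation score; the search loop itself would stay the same.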

Taxonomy

Methods: Feature Selection