Improving DGA-Based Malicious Domain Classifiers for Malware Defense with Adversarial Machine Learning

Ibrahim Yilmaz; Ambareen Siraj; Denis Ulybyshev

arXiv:2101.00521 · cs.CR · January 5, 2021

TL;DR

This paper improves DGA-based malicious domain classification with an LSTM model and a novel feature engineering technique, uses adversarial techniques to generate previously unseen malicious domains, and proposes secure data containers to protect blacklists.

Contribution

It presents a new LSTM-based classifier with innovative feature engineering, an adversarial method to generate unseen malicious domains, and secure containers for blacklists.

Findings

01. LSTM classifier outperforms previous models in accuracy.

02. Adversarial generation of new malicious domains reveals classifier weaknesses.

03. Secure data containers protect blacklists from tampering.

Abstract

Domain Generation Algorithms (DGAs) are used by adversaries to establish Command and Control (C&C) server communications during cyber attacks. Blacklists of known/identified C&C domains are often used as one of the defense mechanisms. However, since blacklists are static and generated by signature-based approaches, they can neither keep up nor detect never-seen-before malicious domain names. Due to this shortcoming of blacklist domain checking, machine learning algorithms have been used to address the problem to some extent. However, when training is performed with limited datasets, the algorithms are likely to fail in detecting new DGA variants. To mitigate this weakness, we successfully applied a DGA-based malicious domain classifier using the Long Short-Term Memory (LSTM) method with a novel feature engineering technique. Our model's performance shows a higher level of accuracy…
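To make the blacklist limitation described in the abstract concrete, here is a minimal sketch of a toy DGA. It is not the paper's method or any real malware family's algorithm: the function names `toy_dga` and `blacklist_hits`, the seed string, and the hash-based scheme are all assumptions chosen for illustration. The idea is that malware and its operator share a seed, so both can derive the same fresh domains each day, while a static blacklist built from earlier domains keeps matching only the old ones.

```python
import hashlib
from datetime import date
from typing import Iterable, List, Set


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12,
            tld: str = ".com") -> List[str]:
    """Hypothetical DGA: derive `count` domains from a shared seed and a date."""
    domains = []
    for i in range(count):
        # Hash the seed, the date, and a counter to get deterministic bytes.
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a lowercase letter (a-p).
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + tld)
    return domains


def blacklist_hits(domains: Iterable[str], blacklist: Set[str]) -> List[str]:
    """Return the domains that a static blacklist would catch."""
    return [d for d in domains if d in blacklist]


# Assumed scenario: defenders blacklist every domain observed on day 1.
day1 = toy_dga("example-seed", date(2021, 1, 5))
day2 = toy_dga("example-seed", date(2021, 1, 6))
blacklist = set(day1)

print("day 1 caught:", len(blacklist_hits(day1, blacklist)), "of", len(day1))
print("day 2 caught:", len(blacklist_hits(day2, blacklist)), "of", len(day2))
```

Because the generator is deterministic, anyone who knows the seed can reproduce a given day's list, but a blacklist compiled on one day says nothing about the next day's output. That gap is what motivates learning-based classifiers that judge the domain string itself.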
