Predicting Domain Generation Algorithms with Long Short-Term Memory Networks

Jonathan Woodbridge; Hyrum S. Anderson; Anjum Ahuja; Daniel Grant

arXiv:1611.00791 · cs.CR · November 4, 2016 · 209 citations

TL;DR

This paper introduces an LSTM-based classifier that detects malware-generated domain names and identifies them, reaching 0.9993 AUC on binary detection and a twentyfold lower false positive rate than previous methods.

Contribution

The paper presents a novel LSTM-based approach to DGA detection that requires no feature extraction and is more accurate than state-of-the-art techniques.
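To illustrate what working without feature extraction can look like, here is a minimal sketch of turning a raw domain string into a fixed-length integer sequence, the kind of input a character-level sequence model consumes. The vocabulary, the `max_len` default, the left-padding choice, and the names `VALID_CHARS` and `encode_domain` are assumptions made for this example; they are not taken from the paper.

```python
import string
from typing import List

# Assumed vocabulary for this sketch: lowercase letters, digits, and "-._".
# Index 0 is reserved for padding, so real characters start at 1.
VALID_CHARS = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(VALID_CHARS)}


def encode_domain(domain: str, max_len: int = 63) -> List[int]:
    """Map a domain to a list of max_len integer indices, left-padded with 0.

    Characters outside the assumed vocabulary are dropped, and domains
    longer than max_len keep only their first max_len characters.
    """
    indices = [CHAR_TO_INDEX[ch] for ch in domain.lower() if ch in CHAR_TO_INDEX]
    indices = indices[:max_len]
    return [0] * (max_len - len(indices)) + indices


if __name__ == "__main__":
    print(encode_domain("ab", max_len=4))
```

No hand-crafted statistics such as entropy or n-gram counts are computed here; the model would see only the character indices.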

Findings

01

Achieves 0.9993 AUC in binary DGA detection.

02

Attains a micro-averaged F1 score of 0.9906 in multiclass classification.

03

Lowers the false positive rate twentyfold compared with previous methods.
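For readers unfamiliar with the metric in the second finding, the sketch below shows how a micro-averaged F1 score is computed: true positives, false positives, and false negatives are summed over all classes before the F1 formula is applied. The function name `micro_f1` and the toy labels are illustrative assumptions, not data or results from the paper.

```python
from collections import Counter
from typing import Sequence


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["benign", "banjori", "bedep", "benign"]
    preds = ["benign", "banjori", "benign", "benign"]
    print(micro_f1(truth, preds))
```

When every sample has exactly one true label and one predicted label, as here, micro-averaged F1 coincides with plain accuracy.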

Abstract

Various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to a command and control (C&C) server. In order to block DGA C&C traffic, security organizations must first discover the algorithm by reverse engineering malware samples, then generate a list of domains for a given seed. The domains are then either preregistered or published in a DNS blacklist. This process is not only tedious, but can be readily circumvented by malware authors using a large number of seeds in algorithms with multivariate recurrence properties (e.g., banjori) or by using a dynamic list of seeds (e.g., bedep). Another technique to stop malware from using DGAs is to intercept DNS queries on a network and predict whether domains are DGA-generated. Such a technique will alert network administrators to the presence of malware on their…
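To make the abstract's notion of a seeded generator concrete, the following toy sketch derives a deterministic list of pseudo-random domain names from a seed. It is a hypothetical illustration only: the hashing scheme, the function name `toy_dga`, and its parameters are assumptions for this example and do not correspond to banjori, bedep, or any other real malware family.

```python
import hashlib
from typing import List


def toy_dga(seed: str, count: int, length: int = 12, tld: str = ".com") -> List[str]:
    """Generate `count` deterministic pseudo-random domains from `seed`.

    Hypothetical scheme: hash "seed:index" with SHA-256 and map the first
    `length` digest bytes to lowercase letters. `length` must be at most 32,
    the size of a SHA-256 digest.
    """
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}:{i}".encode("utf-8")).digest()
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("example-seed", 3):
        print(name)
```

Because the output depends only on the seed, anyone who knows both the algorithm and the seed can reproduce the same list. This is what makes blacklisting possible once the algorithm has been reverse engineered, and it is why a large or changing set of seeds undermines that approach.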

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory