Detecting DNS Tunnels Using Character Frequency Analysis
Kenton Born, David Gustafson

TL;DR
This paper proposes a method to detect DNS tunnels by analyzing character frequency distributions in domain names, distinguishing covert channels from legitimate traffic based on linguistic patterns.
Contribution
It introduces a novel character frequency analysis approach that detects DNS tunnels across multiple domains, improving upon previous point-to-point detection methods.
Findings
Domains follow Zipf's law similar to natural languages
Tunneled traffic exhibits more uniform character distributions
The method enables quick anomaly detection across multiple domains
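The contrast between the first two findings can be illustrated with a small sketch. It ranks unigram character frequencies and summarizes how uniform they are using normalized Shannon entropy. The entropy score is a stand-in chosen for this illustration, not a measure named in the summary above, and both sample lists are made-up inputs rather than data from the paper.

```python
import math
from collections import Counter


def char_frequencies(labels):
    """Return (character, relative frequency) pairs, most common first."""
    counts = Counter(ch for label in labels for ch in label.lower() if ch != ".")
    total = sum(counts.values())
    return [(ch, n / total) for ch, n in counts.most_common()]


def normalized_entropy(freqs):
    """Shannon entropy divided by log2(alphabet size); 1.0 means uniform."""
    if len(freqs) < 2:
        return 0.0
    entropy = -sum(p * math.log2(p) for _, p in freqs)
    return entropy / math.log2(len(freqs))


# Illustrative inputs only (assumptions, not data from the paper).
legit_labels = ["google", "facebook", "wikipedia", "amazon", "youtube"]
# Hypothetical base32-like payload labels in which every symbol appears once.
tunnel_labels = ["abcdefghijklmnop", "qrstuvwxyz234567"]

legit_score = normalized_entropy(char_frequencies(legit_labels))
tunnel_score = normalized_entropy(char_frequencies(tunnel_labels))
print(f"legitimate: {legit_score:.3f}  tunneled: {tunnel_score:.3f}")
```

A score near 1.0 indicates an almost flat distribution, which is what encoded payloads tend to produce; natural-language-like names concentrate mass on a few characters and score lower.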
Abstract
High-bandwidth covert channels pose significant risks to sensitive and proprietary information inside company networks. Domain Name System (DNS) tunnels provide a means to covertly infiltrate and exfiltrate large amounts of information past network boundaries. This paper explores the possibility of detecting DNS tunnels by analyzing the unigram, bigram, and trigram character frequencies of domains in DNS queries and responses. It is empirically shown that domains follow Zipf's law in a pattern similar to natural languages, whereas tunneled traffic has more evenly distributed character frequencies. This approach allows tunnels to be detected across multiple domains, whereas previous methods typically concentrate on monitoring point-to-point systems. Anomalies are quickly discovered when tunneled traffic is compared to the character frequency fingerprint of legitimate domain traffic.
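As a rough illustration of the fingerprint comparison described in the abstract, the sketch below builds unigram, bigram, and trigram distributions from two sets of domains and measures how far apart they are. The abstract does not say which comparison metric the authors use, so total variation distance is an assumption made here, as are the function names and the choice to count n-grams within each dot-separated label.

```python
from collections import Counter


def ngram_distribution(domains, n):
    """Relative frequency of each character n-gram, counted within labels."""
    counts = Counter()
    for domain in domains:
        for label in domain.lower().split("."):
            for i in range(len(label) - n + 1):
                counts[label[i:i + n]] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def fingerprint_distance(baseline, observed, sizes=(1, 2, 3)):
    """Distance from the baseline fingerprint for each n-gram size."""
    return {
        n: total_variation(ngram_distribution(baseline, n),
                           ngram_distribution(observed, n))
        for n in sizes
    }


# Hypothetical sample data; a real baseline would come from captured traffic.
baseline = ["mail.example.com", "news.example.org", "shop.example.net"]
observed = ["mzxw6ytboi4dk.example.com", "nbswy3dpeb3w64.example.com"]
print(fingerprint_distance(baseline, observed))
```

Any threshold on these distances would have to be calibrated against real legitimate traffic; the sketch only shows the shape of the comparison.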
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection · Spam and Phishing Detection
