Anomaly Detection in Cyber Network Data Using a Cyber Language Approach
Bartley D. Richardson, Benjamin J. Radford, Shawn E. Davis, Keegan Hines, David Pekarek

TL;DR
This paper introduces a novel unsupervised machine learning approach using natural language techniques, specifically suffix trees, to detect cyber anomalies and attacks in network data without relying on labeled datasets.
Contribution
It presents a new methodology that models cyber data as a language, enabling detection of diverse attacks through unsupervised learning with positive preliminary results.
Findings
Positive initial results in applying language-based models to flow data
Effective detection of cyber anomalies without labeled data
Potential to identify a wide range of cyber attacks
Abstract
As the amount of cyber data continues to grow, cyber network defenders are faced with increasing amounts of data they must analyze to ensure the security of their networks. In addition, new types of attacks are constantly being created and executed globally. Current rules-based approaches are effective at characterizing and flagging known attacks, but they typically fail when presented with a new attack or new types of data. By comparison, unsupervised machine learning offers distinct advantages by not requiring labeled data to learn from large amounts of network traffic. In this paper, we present a natural language-based technique (suffix trees) as applied to cyber anomaly detection. We illustrate one methodology to generate a language using cyber data features, and our experimental results illustrate positive preliminary results in applying this technique to flow-type data. As an…
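The idea in the abstract, turning flow records into "words" and checking new traffic against a suffix structure built from baseline traffic, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the word encoding (`flow_to_word`), the field names (`proto`, `dst_port`, `bytes`), the bucket thresholds, and the scoring rule are all assumptions made for the example, and a depth-limited suffix trie stands in for a full compressed suffix tree.

```python
def flow_to_word(flow):
    """Hypothetical encoding: map one flow record to a discrete 'word'.

    The fields and thresholds here are assumptions for illustration only.
    """
    port_class = "well_known" if flow["dst_port"] < 1024 else "high"
    size = "small" if flow["bytes"] < 1000 else "large"
    return f'{flow["proto"]}|{port_class}|{size}'


class SuffixTrie:
    """Depth-limited suffix trie over sequences of words (nested dicts).

    A simplified stand-in for a suffix tree: it stores every run of up to
    max_depth consecutive words seen in the baseline sequences.
    """

    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.root = {}

    def add(self, words):
        # Insert the first max_depth words of every suffix of the sequence.
        for i in range(len(words)):
            node = self.root
            for w in words[i:i + self.max_depth]:
                node = node.setdefault(w, {})

    def contains(self, words):
        # True if this exact run of words was seen in the baseline.
        node = self.root
        for w in words:
            if w not in node:
                return False
            node = node[w]
        return True


def anomaly_score(trie, words):
    """Fraction of length-max_depth windows never seen in the baseline."""
    n = trie.max_depth
    windows = [words[i:i + n] for i in range(len(words) - n + 1)]
    if not windows:
        return 0.0
    unseen = sum(1 for w in windows if not trie.contains(w))
    return unseen / len(windows)


# Made-up baseline traffic, used only to exercise the sketch.
baseline_flows = [
    {"proto": "tcp", "dst_port": 443, "bytes": 500},
    {"proto": "tcp", "dst_port": 443, "bytes": 5000},
    {"proto": "udp", "dst_port": 53, "bytes": 100},
] * 2

baseline_words = [flow_to_word(f) for f in baseline_flows]
trie = SuffixTrie(max_depth=3)
trie.add(baseline_words)

familiar = baseline_words[:4]
novel = [baseline_words[0]] * 3  # same word repeated, a pattern absent from the baseline
```

No labels are used at any point: the trie is built only from observed traffic, and a sequence scores high simply because its word patterns were not seen before. In practice the encoding of flows into words would likely matter more than the data structure itself.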
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Internet Traffic Analysis and Secure E-voting
