eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys
Joshua Saxe, Konstantin Berlin

TL;DR
eXpose introduces a deep learning model that automatically learns features from raw character strings to detect malicious URLs, file paths, and registry keys, outperforming traditional manual feature-based methods.
Contribution
The paper presents a novel character-level CNN with embeddings that automates feature extraction for security inputs, reducing manual effort and improving detection accuracy.
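The idea of a character-level CNN with embeddings can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the vocabulary, the input length of 64, the embedding size of 4, the window of 3 characters, the random weights, and the use of max pooling are all assumptions chosen for illustration. It shows only the general flow, where a raw string is mapped to character ids, each id is looked up in an embedding table, and a filter slides over windows of adjacent characters.

```python
import random
import string

# Hypothetical vocabulary: printable ASCII; id 0 is reserved for padding/unknown.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}
MAX_LEN = 64      # assumed fixed input length
EMBED_DIM = 4     # assumed embedding size
KERNEL_WIDTH = 3  # assumed convolution window, in characters


def encode(text, max_len=MAX_LEN):
    """Map characters to integer ids, truncating or zero-padding to max_len."""
    ids = [VOCAB.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def embed(ids, table):
    """Look up one embedding vector per character id."""
    return [table[i] for i in ids]


def conv1d_maxpool(vectors, kernel):
    """Slide one filter over windows of adjacent character vectors,
    apply ReLU, and keep the largest activation."""
    width = len(kernel)
    best = 0.0  # starting at 0 folds the ReLU into the running max
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        act = sum(w * x
                  for k_row, vec in zip(kernel, window)
                  for w, x in zip(k_row, vec))
        best = max(best, act)
    return best


# Random weights stand in for parameters that training would learn.
rng = random.Random(0)
embedding_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
                   for _ in range(len(VOCAB) + 1)]
embedding_table[0] = [0.0] * EMBED_DIM  # padding embeds to the zero vector
kernel = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
          for _ in range(KERNEL_WIDTH)]

ids = encode("http://example.com/login.php?id=1")
feature = conv1d_maxpool(embed(ids, embedding_table), kernel)
```

In a trained model many such filters would run in parallel and their outputs would feed a classifier; here a single filter produces a single feature value.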
Findings
Outperforms manual feature extraction baselines
Yields a 5-10% gain in detection rate over these baselines at a 0.1% false positive rate
Effective on URLs, file paths, and registry keys
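To make the "detection rate at a fixed false positive rate" metric concrete, the following sketch computes it from classifier scores. The function name `detection_rate_at_fpr` and the toy scores are hypothetical, and the paper's own evaluation procedure may differ. The idea is to pick the lowest threshold that keeps the share of flagged benign samples at or below the target, then measure the share of malicious samples flagged at that threshold.

```python
def detection_rate_at_fpr(scores, labels, target_fpr):
    """Return the true positive rate at the lowest threshold whose
    false positive rate does not exceed target_fpr.

    scores: higher means more likely malicious.
    labels: 1 for malicious, 0 for benign.
    """
    benign = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    # Number of benign samples we are allowed to flag by mistake.
    allowed_fp = int(target_fpr * len(benign))
    if allowed_fp >= len(benign):
        threshold = float("-inf")  # every sample may be flagged
    else:
        # Flag only scores strictly above this benign score, so at most
        # allowed_fp benign samples exceed the threshold.
        threshold = benign[allowed_fp]

    detected = sum(1 for s in malicious if s > threshold)
    return detected / len(malicious)


# Toy data: 10 benign and 4 malicious scores (illustrative values only).
benign_scores = [0.9, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.01]
malicious_scores = [0.95, 0.6, 0.45, 0.3]
scores = benign_scores + malicious_scores
labels = [0] * len(benign_scores) + [1] * len(malicious_scores)

rate = detection_rate_at_fpr(scores, labels, target_fpr=0.1)
```

With only 10 benign samples, a 10% target permits one false positive. Measuring at 0.1% as in the finding above requires at least 1,000 benign samples before even one false positive is permitted.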
Abstract
For years security machine learning research has promised to obviate the need for signature-based detection by automatically learning to detect indicators of attack. Unfortunately, this vision hasn't come to fruition: in fact, developing and maintaining today's security machine learning systems can require engineering resources that are comparable to those of signature-based detection systems, due in part to the need to develop and continuously tune the "features" these machine learning systems look at as attacks evolve. Deep learning, a subfield of machine learning, promises to change this by operating on raw input signals and automating the process of feature design and extraction. In this paper we propose the eXpose neural network, which uses a deep learning approach we have developed to take generic, raw short character strings as input (a common case for security inputs, which…
Taxonomy
Topics: Network Security and Intrusion Detection · Spam and Phishing Detection · Advanced Malware Detection Techniques
