Learning from Context: Exploiting and Interpreting File Path Information for Better Malware Detection
Adarsh Kyadige, Ethan M. Rudd, Konstantin Berlin

TL;DR
This paper introduces a multi-view neural network that uses file path context alongside PE file content to improve static malware detection, reducing false negatives by roughly a third at false positive rates of 10^-3 and 10^-4 on a large real-world dataset.
Contribution
The study proposes a novel multi-view neural network model that incorporates file path information into static malware detection, enhancing detection performance with minimal additional infrastructure.
Findings
32.3% reduction in false negatives at a false positive rate (FPR) of 10^-3
33.1% reduction in false negatives at an FPR of 10^-4
Model learns meaningful file path features and artifacts
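The first two findings compare false negative rates at a fixed FPR. A minimal sketch of how such a comparison can be computed is given below; it is an illustration only, not the authors' evaluation code. The function names `fnr_at_fpr` and `relative_reduction`, the score convention (higher means more likely malicious), and the toy data are all assumptions.

```python
def fnr_at_fpr(scores, labels, target_fpr):
    """False negative rate at a threshold whose FPR does not exceed target_fpr.

    scores: detector outputs, higher = more likely malicious (assumed convention).
    labels: 1 for malicious, 0 for benign.
    """
    benign = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    # How many benign samples we are allowed to flag at this FPR.
    allowed_fp = int(target_fpr * len(benign))

    # A sample is flagged when its score is strictly above the threshold,
    # so at most `allowed_fp` benign samples end up flagged.
    threshold = benign[allowed_fp] if allowed_fp < len(benign) else float("-inf")

    missed = sum(1 for s in malicious if s <= threshold)
    return missed / len(malicious)


def relative_reduction(baseline_fnr, new_fnr):
    """Fractional drop in FNR relative to the baseline (0.5 means 50% fewer misses)."""
    return (baseline_fnr - new_fnr) / baseline_fnr


# Toy, hypothetical data: four benign and four malicious samples.
toy_scores = [0.9, 0.3, 0.2, 0.1, 0.95, 0.8, 0.5, 0.25]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_fnr = fnr_at_fpr(toy_scores, toy_labels, target_fpr=0.25)
```

A percentage such as those listed above would then correspond to `relative_reduction(fnr_baseline, fnr_multiview)` with both rates measured at the same target FPR. At rates as low as 10^-4, a large benign set is needed for the threshold to be estimated reliably.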
Abstract
Machine learning (ML) used for static portable executable (PE) malware detection typically employs per-file numerical feature vector representations as input with one or more target labels during training. However, there is much orthogonal information that can be gleaned from the context in which the file was seen. In this paper, we propose utilizing a static source of contextual information -- the path of the PE file -- as an auxiliary input to the classifier. While file paths are not malicious or benign in and of themselves, they do provide valuable context for a malicious/benign determination. Unlike dynamic contextual information, file paths are available with little overhead and can seamlessly be integrated into a multi-view static ML detector, yielding higher detection rates at very high throughput with minimal infrastructural changes. Here we propose a multi-view neural…
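To make the idea of a file path as an auxiliary input concrete, the sketch below shows one possible way to turn a path into a fixed-length numeric view that sits next to a PE feature vector. The excerpt above does not say how paths are encoded, so everything here is an assumption: the character-level scheme, the length of 100, the id layout, and the names `encode_path` and `make_multiview_input`.

```python
MAX_LEN = 100  # assumed fixed path length; not taken from the abstract
PAD = 0        # id used to pad short paths
UNK = 1        # id used for characters outside printable ASCII


def encode_path(path, max_len=MAX_LEN):
    """Map a file path to a fixed-length list of integer character ids."""
    ids = []
    # Lowercase, then keep only the first max_len characters (an assumed choice;
    # keeping the tail of the path would be an equally plausible alternative).
    for ch in path.lower()[:max_len]:
        code = ord(ch)
        # Printable ASCII characters get ids 2..96; anything else maps to UNK.
        ids.append(code - 32 + 2 if 32 <= code < 127 else UNK)
    # Right-pad so every path has exactly max_len ids.
    return ids + [PAD] * (max_len - len(ids))


def make_multiview_input(pe_features, path):
    """Bundle the two views a multi-view classifier would consume."""
    return {"pe_features": list(pe_features), "path_ids": encode_path(path)}


# Hypothetical sample: a two-element PE feature vector and a made-up path.
sample = make_multiview_input([0.1, 0.7], "C:\\Temp\\a.exe")
```

In a multi-view network, each view would typically pass through its own input branch before the branches are merged for a single malicious/benign output; the sketch stops at input preparation and does not model the network itself.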
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
Methods: Interpretability · Local Interpretable Model-Agnostic Explanations
