Robust Neural Malware Detection Models for Emulation Sequence Learning
Rakshit Agrawal, Jack W. Stokes, Mady Marinescu, Karthik Selvaraj

TL;DR
This paper introduces robust neural models, including LSTM and CNN architectures, designed to detect malware from long emulated command sequences, effectively addressing sequence length limitations and evasion tactics.
Contribution
The paper proposes specialized neural models capable of handling extremely long sequences for malware detection, improving over previous models limited to 200 events.
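To illustrate why a 200-event cap matters, here is a minimal sketch. It is not the paper's architecture: the event names and per-event scores are hypothetical, and a simple global max over positions stands in for a learned LSTM or CNN. It shows only that a detector which truncates its input can miss a malicious action placed late in the sequence, while one that reads the full sequence does not.

```python
# Illustrative sketch only: toy per-event scores stand in for a learned model.
from typing import Dict, List

MAX_EVENTS = 200  # cap attributed to earlier models in the summary above

# Hypothetical event names and scores; a real model would learn these from data.
EVENT_SCORES: Dict[str, float] = {
    "read_file": 0.05,
    "open_window": 0.02,
    "write_registry_run_key": 0.70,
    "inject_remote_thread": 0.95,
}


def score_truncated(events: List[str], limit: int = MAX_EVENTS) -> float:
    """Score only the first `limit` events and ignore everything after them."""
    return max((EVENT_SCORES.get(e, 0.0) for e in events[:limit]), default=0.0)


def score_full(events: List[str]) -> float:
    """Score the whole sequence with a global max over all positions."""
    return max((EVENT_SCORES.get(e, 0.0) for e in events), default=0.0)


# A sequence that pads 500 benign events in front of the malicious action.
sequence = ["read_file"] * 500 + ["inject_remote_thread"]

truncated = score_truncated(sequence)
full = score_full(sequence)
print(truncated, full)  # 0.05 0.95
```

The truncated score sees only the benign padding, which is the kind of evasion that full-sequence models are meant to resist.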
Findings
Models successfully detect malware in long sequences
Efficient handling of sequences exceeding 200 events
Large dataset validation with 634,249 file sequences
Abstract
Malicious software, or malware, presents a continuously evolving challenge in computer security. These embedded snippets of code, in the form of malicious files or hidden within legitimate files, pose a major risk to systems through their ability to run malicious command sequences. Malware authors even use polymorphism to reorder these commands and create several malicious variations. However, if the file is executed in a secure environment, one can perform early malware detection on the emulated command sequences. The models presented in this paper leverage this sequential data derived via emulation in order to perform Neural Malware Detection. These models target the core of the malicious operation by learning the presence and pattern of co-occurrence of malicious event actions from within these sequences. Our models can capture entire event sequences and be trained directly using the known target…
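The idea that co-occurrence can survive command reordering can be sketched as follows. This is an assumption-laden toy, not the paper's method: the event names are hypothetical, and hand-built sets of unordered event pairs stand in for what the neural models learn. It shows only that a feature based on which events appear together is unchanged when a polymorphic variant shuffles their order.

```python
# Illustrative sketch only: hand-built pair features, not a learned representation.
from itertools import combinations
from typing import FrozenSet, List, Set


def cooccurrence_pairs(events: List[str]) -> Set[FrozenSet[str]]:
    """Return the set of unordered pairs of distinct event types in a sequence."""
    distinct = sorted(set(events))
    return {frozenset(pair) for pair in combinations(distinct, 2)}


# Hypothetical event names; the second list is a reordered variant of the first.
original = ["open_file", "decrypt_buffer", "write_file", "create_process"]
reordered = ["decrypt_buffer", "open_file", "create_process", "write_file"]

same_features = cooccurrence_pairs(original) == cooccurrence_pairs(reordered)
print(same_features)  # True
```

A purely order-free feature like this discards the pattern information the abstract also mentions, which is one reason sequence models are used instead of fixed pair counts.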
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
