Malicious Code Detection: Run Trace Output Analysis by LSTM
Cengiz Acarturk, Melih Sirlanci, Pinar Gurkan Balikcioglu, Deniz Demirci, Nazenin Sahin, Ozge Acar Kucuk

TL;DR
This paper presents an LSTM-based framework for detecting malicious code from the run trace outputs produced by dynamic analysis of PE files. Its stronger model (BSM) reaches 99.26% accuracy with a 2.62% false positive rate.
Contribution
It introduces a methodological framework that applies LSTM models to run trace sequences for malware detection, and it compares two sequence models, ISM and BSM, with BSM improving markedly on ISM's accuracy.
Findings
ISM achieved 87.51% accuracy with an 18.34% false positive rate.
BSM achieved 99.26% accuracy with a 2.62% false positive rate.
Dynamic analysis with sequence models enhances malware detection effectiveness.
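For readers comparing the two figures reported for each model, the sketch below shows how accuracy and false positive rate are conventionally computed from a binary confusion matrix. It assumes that "malicious" is the positive class, which this summary does not state. The function name `accuracy_and_fpr` and the example counts are hypothetical and are not taken from the paper.

```python
def accuracy_and_fpr(tp, tn, fp, fn):
    """Return (accuracy, false positive rate) from confusion-matrix counts.

    Assumption: "malicious" is the positive class, so a false positive
    is a benign file flagged as malicious.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    benign_total = fp + tn  # all truly benign samples
    fpr = fp / benign_total if benign_total else 0.0
    return accuracy, fpr


# Hypothetical counts for illustration only (not the paper's data).
acc, fpr = accuracy_and_fpr(tp=90, tn=80, fp=20, fn=10)
print(f"accuracy={acc:.2%}, false positive rate={fpr:.2%}")
```

The two numbers can diverge: a model can score well on accuracy while still flagging a sizable share of benign files, which is why both are reported.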
Abstract
Malicious software threats and their detection have been gaining importance as a subdomain of information security due to the expansion of ICT applications in daily settings. A major challenge in designing and developing anti-malware systems is the coverage of the detection, particularly the development of dynamic analysis methods that can detect polymorphic and metamorphic malware efficiently. In the present study, we propose a methodological framework for detecting malicious code by analyzing run trace outputs by Long Short-Term Memory (LSTM). We developed models of run traces of malicious and benign Portable Executable (PE) files. We created our dataset from run trace outputs obtained from dynamic analysis of PE files. The obtained dataset was in the instruction format as a sequence and was called Instruction as a Sequence Model (ISM). By splitting the first dataset into basic…
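As a rough illustration of what treating a run trace "as a sequence" can mean in practice, the sketch below turns instruction traces into fixed-length integer sequences, a common preprocessing step before feeding data to an LSTM. It is not the authors' pipeline. It assumes each trace is already reduced to a list of instruction mnemonics, and the names `build_vocab` and `encode` and the `<pad>` and `<unk>` tokens are hypothetical.

```python
def build_vocab(traces):
    """Map each distinct mnemonic to an integer id (0 = padding, 1 = unknown)."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for trace in traces:
        for mnemonic in trace:
            vocab.setdefault(mnemonic, len(vocab))
    return vocab


def encode(trace, vocab, max_len):
    """Truncate or right-pad a trace to max_len integer ids."""
    ids = [vocab.get(m, vocab["<unk>"]) for m in trace[:max_len]]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Hypothetical traces for illustration only.
training_traces = [["push", "mov", "call"], ["mov", "xor"]]
vocab = build_vocab(training_traces)
print(encode(["mov", "xor", "jmp"], vocab, max_len=5))  # "jmp" is unseen
```

A real run trace line typically also carries addresses and operands. Whether to keep them is a design choice that this summary does not settle.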
