Behavioural Reports of Multi-Stage Malware
Marcus Carpenter, Chunbo Luo

TL;DR
This paper introduces a new dataset of API call sequences from malware samples, supporting multi-label classification of malicious behaviors to enhance host-based intrusion detection systems.
Contribution
The paper presents a novel dataset of long API call sequences and a multi-label classification scheme for malware behavior analysis, intended to improve anti-malware detection.
Findings
The dataset includes thousands of malware samples with detailed API call sequences.
Three feature selection methods were tested for resource-efficient model deployment.
The multi-label classification approach allows for detailed behavior tagging of malware.
Abstract
The extensive damage caused by malware requires anti-malware systems to be constantly improved to prevent new threats. The current trend in malware detection is to employ machine learning models to aid in the classification process. We propose a new dataset with the objective of improving current anti-malware systems. The focus of this dataset is to improve host-based intrusion detection systems by providing API call sequences for thousands of malware samples executed in Windows 10 virtual machines. A tutorial on how to create and expand this dataset is provided, along with a benchmark demonstrating how to use this dataset to classify malware. The data contains long sequences of API calls for each sample, and in order to create models that can be deployed on resource-constrained devices, three feature selection methods were tested. The principal innovation, however, lies in the…
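To make the data shape described in the abstract concrete, here is a minimal sketch of how API call sequences might be turned into fixed-size feature vectors with multi-label targets. Everything in it is an assumption for illustration: the toy reports, the behaviour label names, and the document-frequency ranking used to pick features. The abstract does not name the three feature selection methods that were tested, so this ranking is a stand-in, not the authors' method.

```python
from collections import Counter

# Hypothetical toy reports (not from the dataset): each sample has an
# API call sequence and a set of behaviour labels.
reports = [
    {"calls": ["NtCreateFile", "NtWriteFile", "NtClose", "NtWriteFile"],
     "labels": {"dropper"}},
    {"calls": ["RegOpenKeyExW", "RegSetValueExW", "NtClose"],
     "labels": {"persistence"}},
    {"calls": ["NtCreateFile", "RegSetValueExW", "NtWriteFile"],
     "labels": {"dropper", "persistence"}},
]


def select_top_k_calls(reports, k):
    """Keep the k API calls seen in the most samples (ties broken by name).

    Assumed stand-in for a feature selection step; a smaller vocabulary
    means smaller vectors for resource-constrained deployment.
    """
    doc_freq = Counter()
    for report in reports:
        doc_freq.update(set(report["calls"]))
    ranked = sorted(doc_freq, key=lambda call: (-doc_freq[call], call))
    return ranked[:k]


def count_vector(calls, vocabulary):
    """Count how often each selected API call occurs in one sequence."""
    counts = Counter(calls)
    return [counts[call] for call in vocabulary]


def label_vector(labels, label_names):
    """Encode a label set as a binary indicator vector (multi-label target)."""
    return [1 if name in labels else 0 for name in label_names]


vocabulary = select_top_k_calls(reports, k=3)
label_names = sorted({name for r in reports for name in r["labels"]})
X = [count_vector(r["calls"], vocabulary) for r in reports]
Y = [label_vector(r["labels"], label_names) for r in reports]
```

Each row of `Y` can have more than one entry set to 1, which is what distinguishes multi-label tagging from assigning a single malware family per sample. Count vectors discard call order, so a sequence model would need a different representation.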
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection
Methods: Feature Selection
