Grouping the executables to detect malware with high accuracy
Sanjay K. Sahay, Ashu Sharma

TL;DR
This paper proposes a malware detection method that groups executables by size using k-Means clustering and classifies known and unknown malware variants with up to 99.11% accuracy.
Contribution
The study introduces a size-based grouping approach combined with machine learning classifiers for effective malware detection, especially for metamorphic variants.
Findings
Size-based grouping effectively clusters malware variants.
High detection accuracy of up to 99.11% achieved.
Method can classify unknown malware with high precision.
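The size-based grouping named above can be illustrated with a minimal sketch of one-dimensional k-Means over file sizes. It is not the authors' implementation: the function names `kmeans_1d` and `nearest`, the deterministic initialisation, and the use of raw size as the only feature are assumptions made for illustration. The per-group classifiers that the summary mentions are not shown.

```python
def nearest(size, centroids):
    """Return the index of the centroid closest to a file size."""
    return min(range(len(centroids)), key=lambda i: abs(size - centroids[i]))


def kmeans_1d(sizes, k, max_iter=100):
    """Cluster scalar file sizes into k groups with Lloyd's algorithm.

    Hypothetical helper: the initialisation and stopping rule are
    illustrative choices, not taken from the paper.
    """
    distinct = sorted(set(sizes))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct sizes")

    # Deterministic start: spread the initial centroids across the sorted sizes.
    step = (len(distinct) - 1) / max(k - 1, 1)
    centroids = [float(distinct[round(i * step)]) for i in range(k)]

    for _ in range(max_iter):
        # Assignment step: put each size in the group of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for s in sizes:
            clusters[nearest(s, centroids)].append(s)

        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged: assignments no longer change
            break
        centroids = updated

    return centroids
```

An unknown file would then be routed to a group by calling `nearest(file_size, centroids)`, and a classifier trained on that group alone would decide whether the file is malicious. Any sizes passed in are the caller's own data; the sketch assumes a single consistent unit such as kilobytes.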
Abstract
Metamorphic malware variants with the same malicious behavior (the same family) can obfuscate themselves to look different from each other. This structural variation forces traditional signature-matching techniques to maintain a huge signature database in order to detect them. To detect malware effectively and efficiently among large numbers of executables, we need to partition these files into groups that can identify their respective families. In addition, the grouping criteria should be chosen in such a way that they can also be applied to classify unknown files encountered on computers. This paper discusses the study of malware and benign executables in groups to detect unknown malware with high accuracy. We studied the sizes of malware generated by three popular second-generation malware (metamorphic malware) creator kits, viz. G2, PS-MPC and NGVCK, and observed that the size variation in…
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection
