An effective approach for classification of advanced malware with high accuracy
Ashu Sharma, Sanjay K. Sahay

TL;DR
This paper introduces a novel opcode-based classification method for advanced malware detection, achieving over 97% accuracy with Random Forest, surpassing previous methods.
Contribution
The paper presents a new opcode analysis approach combined with grouping strategies and evaluates multiple classifiers, notably improving detection accuracy for advanced malware.
Findings
Random forest achieved 97.95% accuracy.
The approach outperforms previous malware detection methods.
Thirteen classifiers were evaluated; the top five (Random forest, LMT, NBT, J48 and FT) were studied in depth.
Abstract
Combating malware is very important for software/systems security, but protecting software/systems from advanced malware, viz. metamorphic malware, is a challenging task because such malware changes its structure/code after each infection. In this paper, we therefore present a novel approach to detect advanced malware with high accuracy by grouping the executables and analyzing the occurrence of opcodes (features). The groups are based on our earlier finding [1] that the sizes of any two malware samples generated by the popular advanced malware kits, viz. PS-MPC, G2 and NGVCK, are within 5 KB of each other. Using the promising features obtained, we studied the performance of thirteen classifiers with N-fold cross-validation, as available in the machine learning tool WEKA. Among these thirteen classifiers, we studied the top five (Random forest, LMT, NBT, J48 and FT) in depth…
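The grouping and opcode-occurrence idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed 5 KB-wide size bins, the use of relative frequency as the "occurrence" measure, the function names, and the toy samples are all assumptions made for illustration, since the abstract does not spell out these details.

```python
from collections import Counter, defaultdict

# Assumption: executables are grouped into fixed 5 KB-wide bins by file size.
GROUP_WIDTH_BYTES = 5 * 1024


def size_group(size_bytes, width=GROUP_WIDTH_BYTES):
    """Return the index of the size bin an executable falls into."""
    return size_bytes // width


def opcode_frequencies(opcodes):
    """Map each opcode to its relative frequency within one executable.

    Assumption: relative frequency stands in for 'occurrence of opcodes'.
    """
    counts = Counter(opcodes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {op: n / total for op, n in counts.items()}


def group_samples(samples):
    """Bucket (name, size_bytes, opcodes) samples by size bin.

    Each bucket holds (name, opcode-frequency dict) pairs that could later be
    turned into feature vectors for a classifier.
    """
    groups = defaultdict(list)
    for name, size_bytes, opcodes in samples:
        groups[size_group(size_bytes)].append((name, opcode_frequencies(opcodes)))
    return dict(groups)


# Hypothetical toy data; real input would come from disassembled executables.
samples = [
    ("a.exe", 3000, ["mov", "push", "mov", "call"]),
    ("b.exe", 4500, ["mov", "jmp"]),
    ("c.exe", 9000, ["push", "push", "ret"]),
]
groups = group_samples(samples)
```

Here `a.exe` and `b.exe` share a size bin while `c.exe` lands in the next one, so features would be selected and classifiers trained per group rather than over the whole collection at once.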
