Detecting new obfuscated malware variants: A lightweight and interpretable machine learning approach
Oladipo A. Madamidola, Felix Ngobigha, Adnane Ez-zizi

TL;DR
This paper introduces a lightweight, interpretable machine learning system for detecting new obfuscated malware variants. The system is trained on a single malware subtype yet identifies multiple unseen subtypes with high accuracy.
Contribution
The study demonstrates that training on a single malware subtype, combined with feature selection, enables detection of diverse unseen malware subtypes, advancing adaptable malware detection methods.
Findings
The system achieved over 99.8% accuracy in detecting 15 malware subtypes.
The system processes each file in approximately 5.7 microseconds.
Feature selection improved interpretability while maintaining high performance.
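The summary does not say how the processing-time figure was measured. As a rough sketch of one common way to obtain a per-file number, the snippet below times a whole batch and divides by the number of files. The `toy_classifier` function and the synthetic feature rows are hypothetical stand-ins, not the authors' model or data, so the printed value says nothing about the reported 5.7 microseconds.

```python
import time


def mean_time_per_file_us(classify, feature_rows, repeats=5):
    """Return the best-of-`repeats` mean classification time per file, in microseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for row in feature_rows:
            classify(row)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # keep the least-disturbed run
    return best / len(feature_rows) * 1e6


def toy_classifier(row):
    """Hypothetical stand-in for a trained model: flags a file when its features sum past a threshold."""
    return int(sum(row) > 8.0)


# Synthetic feature vectors, one per "file" (assumed shape: 4 numeric features).
rows = [[float(i % 7)] * 4 for i in range(10_000)]
per_file_us = mean_time_per_file_us(toy_classifier, rows)
print(f"{per_file_us:.2f} microseconds per file")
```

Taking the fastest of several repeats reduces noise from other processes. Averaging over a large batch matters because a single microsecond-scale call is too short to time reliably on its own.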
Abstract
Machine learning has been successfully applied in developing malware detection systems, with a primary focus on accuracy, and increasing attention to reducing computational overhead and improving model interpretability. However, an important question remains underexplored: How well can machine learning-based models detect entirely new forms of malware not present in the training data? In this study, we present a machine learning-based system for detecting obfuscated malware that is not only highly accurate, lightweight and interpretable, but also capable of successfully adapting to new types of malware attacks. Our system is capable of detecting 15 malware subtypes despite being exclusively trained on one malware subtype, namely the Transponder from the Spyware family. This system was built after training 15 distinct random forest-based models, each on a different malware subtype from…
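The evaluation idea in the abstract, training on benign files plus one malware subtype and then testing on subtypes never seen in training, can be sketched as a data split. This is a minimal illustration under stated assumptions. The subtype names and the synthetic features are hypothetical, and a simple nearest-centroid rule stands in for the random forest models the abstract mentions. The toy data is built so that unseen subtypes resemble the training subtype, so it illustrates the protocol only, not the paper's results.

```python
import math
import random


def make_toy_samples(seed=0, per_class=200, n_features=4):
    """Build (features, label) pairs; labels and feature centres are hypothetical."""
    rng = random.Random(seed)
    centres = {"benign": 0.0, "subtype_a": 4.0, "subtype_b": 5.0, "subtype_c": 6.0}
    return [
        ([rng.gauss(centre, 1.0) for _ in range(n_features)], label)
        for label, centre in centres.items()
        for _ in range(per_class)
    ]


def split_single_subtype(samples, train_subtype):
    """Train on benign + one subtype; hold out half the benign files and all other subtypes."""
    benign = [x for x, label in samples if label == "benign"]
    half = len(benign) // 2
    train = [(x, 0) for x in benign[:half]]
    train += [(x, 1) for x, label in samples if label == train_subtype]
    test = {"benign": [(x, 0) for x in benign[half:]]}
    for x, label in samples:
        if label not in ("benign", train_subtype):
            test.setdefault(label, []).append((x, 1))
    return train, test


def fit_centroids(train):
    """Stand-in classifier (not a random forest): mean feature vector per class."""
    centroids = {}
    for cls in (0, 1):  # 0 = benign, 1 = malware
        rows = [x for x, y in train if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda cls: math.dist(x, centroids[cls]))


def accuracy(centroids, labelled):
    return sum(predict(centroids, x) == y for x, y in labelled) / len(labelled)


samples = make_toy_samples()
train, test = split_single_subtype(samples, "subtype_a")
model = fit_centroids(train)
per_group_accuracy = {name: accuracy(model, rows) for name, rows in test.items()}
print(per_group_accuracy)
```

Reporting accuracy per held-out group, rather than one pooled number, shows whether a model trained on one subtype fails on any particular unseen subtype or on benign files.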
Taxonomy
Topics: Advanced Malware Detection Techniques · Spam and Phishing Detection · Cybercrime and Law Enforcement Studies
Methods: Softmax · Attention Is All You Need · Focus · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
