I-MAD: Interpretable Malware Detector Using Galaxy Transformer
Miles Q. Li, Benjamin C. M. Fung, Philippe Charland, Steven H.H. Ding

TL;DR
I-MAD is an interpretable static malware detector. Its Galaxy Transformer network models assembly code semantics at the basic block, function, and executable levels, and the paper reports higher accuracy than state-of-the-art static malware detection models along with explanations of its detection results.
Contribution
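The paper proposes two components: a novel Galaxy Transformer network that captures assembly code semantics for malware detection, and an interpretable neural network that explains the detection results. The combined model is reported to outperform state-of-the-art static detectors in accuracy.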
Findings
Outperforms state-of-the-art static malware detection models in accuracy
Provides meaningful interpretations of detection results
Retains interpretability without sacrificing detection performance
Abstract
Malware currently presents a number of serious threats to computer users. Signature-based malware detection methods are limited in detecting new malware samples that are significantly different from known ones. Therefore, machine learning-based methods have been proposed, but there are two challenges these methods face. The first is to model the full semantics behind the assembly code of malware. The second challenge is to provide interpretable results while keeping excellent detection performance. In this paper, we propose an Interpretable MAlware Detector (I-MAD) that outperforms state-of-the-art static malware detection models regarding accuracy with excellent interpretability. To improve the detection performance, I-MAD incorporates a novel network component called the Galaxy Transformer network that can understand assembly code at the basic block, function, and executable levels.…
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
