To believe or not to believe: Validating explanation fidelity for dynamic malware analysis
Li Chen, Carter Yagemann, Evan Downing

TL;DR
This paper evaluates the fidelity of explanation methods for deep learning models classifying malware images, demonstrating their potential and limitations in providing security insights.
Contribution
It extends local explanation algorithms to image-based malware classification and validates their effectiveness through two case studies.
Findings
Interpretation identifies exploit behaviors and cryptography APIs.
Explanation fidelity varies with image representation.
Current techniques show promise but have limitations.
Abstract
Converting malware into images and then applying vision-based deep learning algorithms has shown superior threat detection efficacy compared with classical machine learning algorithms. When malware samples are visualized as images, vision-based interpretation schemes can also be applied to extract insights into why individual samples are classified as malicious. In this work, via two case studies of dynamic malware classification, we extend the local interpretable model-agnostic explanation (LIME) algorithm to explain image-based dynamic malware classification and examine its interpretation fidelity. For both case studies, we first train deep learning models via transfer learning on malware images, demonstrate high classification effectiveness, apply an explanation method on the images, and correlate the results back to the samples to validate whether the algorithmic insights are consistent with security…
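The explanation step the abstract describes can be illustrated with a minimal LIME-style sketch. This is not the authors' implementation and does not use the `lime` package: rectangular grid cells stand in for superpixels, and `lime_image_sketch`, `toy_model`, and all parameter values are hypothetical names and settings chosen for illustration. The idea is to blank out random regions of the image, query the black-box classifier on each perturbed copy, and fit a proximity-weighted linear surrogate whose coefficients rank the regions by influence.

```python
import numpy as np

def lime_image_sketch(image, predict_fn, grid=4, n_samples=500,
                      kernel_width=0.5, ridge=1e-3, seed=0):
    """Return a (grid, grid) array of region importances for one 2-D image."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Assumption: a regular grid replaces a real superpixel segmentation.
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    segments = rows[:, None] * grid + cols[None, :]
    n_seg = grid * grid

    # Binary masks: 1 keeps a region, 0 blanks it to zero.
    masks = rng.integers(0, 2, size=(n_samples, n_seg))
    masks[0] = 1  # first sample is the unperturbed image

    # Query the black-box model on every perturbed image.
    preds = np.array([predict_fn(image * m[segments]) for m in masks])

    # Samples that keep more regions are closer to the original image.
    dist = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Weighted ridge regression with an unpenalized intercept.
    X = np.hstack([np.ones((n_samples, 1)), masks])
    reg = ridge * np.eye(n_seg + 1)
    reg[0, 0] = 0.0
    A = X.T @ (weights[:, None] * X) + reg
    b = X.T @ (weights * preds)
    coef = np.linalg.solve(A, b)
    return coef[1:].reshape(grid, grid)

def toy_model(img):
    """Hypothetical classifier score that depends only on the top-left 2x2 pixels."""
    return float(img[:2, :2].mean())

heat = lime_image_sketch(np.ones((8, 8)), toy_model)
top_region = np.unravel_index(np.argmax(heat), heat.shape)
```

Because `toy_model` reads only the top-left cell, the surrogate should assign that cell the largest weight. With a real malware image, validating fidelity means mapping the top-ranked regions back to the bytes or API calls they encode and checking whether they match known malicious behavior.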
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications
