Can We Trust Your Explanations? Sanity Checks for Interpreters in Android Malware Analysis
Ming Fan, Wenying Wei, Xiaofei Xie, Yang Liu, Xiaohong Guan, Ting Liu

TL;DR
This paper introduces three quantitative metrics (stability, robustness, and effectiveness) for evaluating how far explanation methods can be trusted in Android malware analysis, where the results of current explanation approaches often fail to agree.
Contribution
It proposes principled guidelines and three quantitative metrics for assessing five explanation approaches, validated through experiments on five widely-used malware datasets and multiple tasks.
Findings
The proposed metrics effectively evaluate the stability, robustness, and effectiveness of explanations.
Most explanation approaches show varying levels of reliability in malware analysis.
The study enhances understanding of malicious behaviors through explanation assessment.
Abstract
With the rapid growth of Android malware, many machine learning-based malware analysis approaches have been proposed to mitigate the threat. However, such classifiers are opaque and non-intuitive, and analysts find it difficult to understand the reasoning behind their decisions. For this reason, a variety of explanation approaches have been proposed to interpret predictions by identifying important features. Unfortunately, the explanation results obtained in the malware analysis domain generally fail to reach a consensus, which leaves analysts unsure whether they can trust such results. In this work, we propose principled guidelines to assess the quality of five explanation approaches by designing three critical quantitative metrics to measure their stability, robustness, and effectiveness. Furthermore, we collect five widely-used malware datasets and apply the explanation approaches on them in…
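The abstract names stability as one of the three metrics but this summary does not give its definition. As a rough illustration only, the sketch below shows one common way to quantify agreement between two explanations of the same sample: the Jaccard overlap of their top-k most important features. This is an assumption for illustration, not the paper's metric; the function names, feature names, and scores are all hypothetical.

```python
def top_k_features(scores, k):
    """Return the set of the k feature names with the largest absolute importance."""
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return set(ranked[:k])


def top_k_overlap(scores_a, scores_b, k):
    """Jaccard similarity between the top-k feature sets of two explanations."""
    top_a = top_k_features(scores_a, k)
    top_b = top_k_features(scores_b, k)
    union = top_a | top_b
    if not union:
        # Two empty explanations are treated as identical.
        return 1.0
    return len(top_a & top_b) / len(union)


# Hypothetical importance scores from two runs of one explainer on the same app.
run_1 = {"SEND_SMS": 0.9, "READ_CONTACTS": 0.5, "INTERNET": 0.2, "VIBRATE": 0.05}
run_2 = {"SEND_SMS": 0.8, "INTERNET": 0.6, "READ_CONTACTS": 0.1, "VIBRATE": 0.02}

# Assumed stand-in for a stability score: 1.0 means identical top-k sets.
stability = top_k_overlap(run_1, run_2, k=2)
```

A score near 1.0 would suggest the explainer highlights the same features across runs, while a low score would indicate the kind of disagreement the abstract describes.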
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
