Towards a Fair Comparison and Realistic Evaluation Framework of Android Malware Detectors based on Static Analysis and Machine Learning
Borja Molina-Coronado, Usue Mori, Alexander Mendiburu, Jose Miguel-Alonso

TL;DR
This paper critically analyzes existing Android malware detection methods based on static analysis and machine learning, highlighting the importance of realistic evaluation scenarios for fair comparison and improved future solutions.
Contribution
It introduces a common evaluation framework to assess ML-based Android malware detectors and identifies key factors affecting their performance and reproducibility.
Findings
Many detectors are evaluated optimistically, overestimating their effectiveness.
Factors like duplicates, label attribution, and app evolution significantly impact results.
Realistic evaluation scenarios are essential for genuine progress in malware detection.
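To illustrate why duplicates can inflate results, here is a minimal Python sketch. It is not the paper's framework: the toy samples, the permission-style feature names, and the helper functions (`feature_fingerprint`, `deduplicate`, `split`, `leaked`) are all hypothetical. It assumes each app is represented by a set of string features, and it removes samples with identical feature sets before splitting, so that no test sample has an exact copy in the training set.

```python
import hashlib
import random


def feature_fingerprint(features):
    """Hash a feature set so that identical samples share one key."""
    canonical = "\n".join(sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(samples):
    """Keep the first sample seen for each distinct feature fingerprint."""
    seen = set()
    unique = []
    for features, label in samples:
        key = feature_fingerprint(features)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


def split(samples, test_fraction=0.25, seed=0):
    """Shuffle with a fixed seed and cut into train and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def leaked(train, test):
    """Count test samples whose feature set also appears in train."""
    train_keys = {feature_fingerprint(f) for f, _ in train}
    return sum(1 for f, _ in test if feature_fingerprint(f) in train_keys)


# Hypothetical toy dataset: (feature set, label), 1 = malware, 0 = goodware.
samples = [
    ({"SEND_SMS", "READ_CONTACTS"}, 1),
    ({"READ_CONTACTS", "SEND_SMS"}, 1),   # duplicate of the first sample
    ({"INTERNET"}, 0),
    ({"INTERNET"}, 0),                    # duplicate of the third sample
    ({"INTERNET", "CAMERA"}, 0),
    ({"RECEIVE_BOOT_COMPLETED", "SEND_SMS"}, 1),
]

unique = deduplicate(samples)
train, test = split(unique)
print(len(samples), len(unique), leaked(train, test))
```

Deduplicating before the split is the design choice that matters here: a classifier that has memorized a training sample will trivially classify its copy in the test set, which makes the measured performance look better than it would be on unseen apps. The sketch keeps the first label it sees for each fingerprint; handling duplicates with conflicting labels would need an explicit policy.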
Abstract
As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performance. However, the lack of reproducibility and the absence of a standard evaluation framework make these proposals difficult to compare. In this paper, we perform an analysis of 10 influential research works on Android malware detection using a common evaluation framework. We have identified five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models and their performances. In particular, we analyze the effect of (1) the presence of duplicated samples, (2) label (goodware/greyware/malware) attribution, (3) class imbalance, (4) the…
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Mobile and Web Applications
