Reassessing feature-based Android malware detection in a contemporary context
Ali Muzaffar, Hani Ragab Hassen, Hind Zantout, Michael A Lones

TL;DR
This study reimplements 18 feature-based Android malware detection studies published in 2013-2023 and reevaluates them on a balanced, contemporary dataset of 124,000 applications, finding that simple static and dynamic features paired with basic models remain highly effective.
Contribution
It provides a comprehensive reimplementation and evaluation of foundational studies, highlighting the continued relevance of simple feature-based approaches in modern Android malware detection.
Findings
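Feature-based methods can still achieve detection accuracies above 98%.
API calls and opcodes are the most productive static features; network traffic is the most predictive dynamic feature.
Dynamic features yield only a small benefit over static features, and simpler models often outperform more complex ones.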
Ensemble models effectively combine static and dynamic features.
Abstract
We report the findings of a reimplementation of 18 foundational studies in feature-based machine learning for Android malware detection, published during the period 2013-2023. These studies are reevaluated on a level playing field using a contemporary Android environment and a balanced dataset of 124,000 applications. Our findings show that feature-based approaches can still achieve detection accuracies beyond 98%, despite a considerable increase in the size of the underlying Android feature sets. We observe that features derived through dynamic analysis yield only a small benefit over those derived from static analysis, and that simpler models often out-perform more complex models. We also find that API calls and opcodes are the most productive static features within our evaluation context, network traffic is the most predictive dynamic feature, and that ensemble models provide an…
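To make the idea of combining static and dynamic models concrete, here is a minimal sketch of one common ensemble scheme, soft voting, in which the malware probabilities produced by a static-feature model and a dynamic-feature model are averaged per application. This is an illustration under stated assumptions, not the ensemble method used in the paper: the function names, weights, threshold, and all scores below are hypothetical.

```python
from statistics import fmean


def soft_vote(prob_static, prob_dynamic, weights=(0.5, 0.5), threshold=0.5):
    """Fuse per-app malware probabilities from two models by weighted averaging.

    Returns the fused probabilities and the resulting 0/1 labels
    (1 = malware, 0 = benign). Names and defaults are illustrative only.
    """
    if len(prob_static) != len(prob_dynamic):
        raise ValueError("both models must score the same applications")
    w_static, w_dynamic = weights
    total = w_static + w_dynamic
    fused = [
        (w_static * ps + w_dynamic * pd) / total
        for ps, pd in zip(prob_static, prob_dynamic)
    ]
    labels = [1 if p >= threshold else 0 for p in fused]
    return fused, labels


def accuracy(predicted, actual):
    """Fraction of applications whose predicted label matches the true label."""
    return fmean(1.0 if p == a else 0.0 for p, a in zip(predicted, actual))


# Hypothetical probabilities for four apps (not data from the paper).
static_probs = [0.9, 0.2, 0.6, 0.4]   # e.g. a model trained on API calls
dynamic_probs = [0.4, 0.1, 0.3, 0.7]  # e.g. a model trained on network traffic
truth = [1, 0, 0, 1]

fused_probs, fused_labels = soft_vote(static_probs, dynamic_probs)
static_labels = [1 if p >= 0.5 else 0 for p in static_probs]
dynamic_labels = [1 if p >= 0.5 else 0 for p in dynamic_probs]

print("static only :", accuracy(static_labels, truth))
print("dynamic only:", accuracy(dynamic_labels, truth))
print("soft vote   :", accuracy(fused_labels, truth))
```

In this toy example each model misclassifies different applications, so averaging their scores corrects errors that either model makes alone. On real data the gain depends on how correlated the two models' mistakes are.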
