R+R: Revisiting Static Feature-Based Android Malware Detection using Machine Learning
Md Tanvirul Alam, Dipkamal Bhusal, Nidhi Rastogi

TL;DR
This paper critically examines static feature-based Android malware detection using machine learning, highlighting reproducibility issues and demonstrating that well-tuned, simpler models like XGBoost outperform complex neural networks.
Contribution
It introduces a rigorous evaluation methodology, addresses reproducibility concerns, and shows that simpler, well-tuned models can outperform complex neural networks in malware detection.
Findings
Simpler models like XGBoost outperform neural networks when properly tuned.
Removing dataset duplicates improves model performance.
Open-source code promotes reproducibility in security research.
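The duplicate-removal finding can be illustrated with a minimal sketch. It assumes that a "duplicate" means two samples with an identical static feature set and the same label; the paper's exact criterion is not stated in this summary. The feature names and the `deduplicate` helper are hypothetical.

```python
from typing import List, Set, Tuple

Sample = Tuple[Set[str], int]  # (static feature set, label: 1 = malware, 0 = benign)


def deduplicate(samples: List[Sample]) -> List[Sample]:
    """Keep the first occurrence of each (feature set, label) pair.

    Assumption: duplicates are samples whose feature sets and labels match exactly.
    """
    seen = set()
    unique = []
    for features, label in samples:
        key = (frozenset(features), label)  # order-independent, hashable key
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


# Hypothetical feature names, for illustration only.
samples = [
    ({"perm.SEND_SMS", "api.getDeviceId"}, 1),
    ({"api.getDeviceId", "perm.SEND_SMS"}, 1),  # same feature set as the first sample
    ({"perm.INTERNET"}, 0),
]
unique = deduplicate(samples)
print(len(samples), "->", len(unique))  # prints "3 -> 2"
```

Deduplicating before the train/test split keeps identical feature vectors from landing on both sides of it.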
Abstract
Static feature-based Android malware detection using machine learning (ML) remains critical due to its scalability and efficiency. However, existing approaches often overlook security-critical reproducibility concerns, such as dataset duplication, inadequate hyperparameter tuning, and variance from random initialization. This can significantly compromise the practical effectiveness of these systems. In this paper, we systematically investigate these challenges by proposing a more rigorous methodology for model selection and evaluation. Using two widely used datasets, Drebin and APIGraph, we evaluate six ML models of varying complexity under both offline and continuous active learning settings. Our analysis demonstrates that, contrary to popular belief, well-tuned, simpler models, particularly tree-based methods like XGBoost, consistently outperform more complex neural networks,…
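One way to account for the variance from random initialization mentioned in the abstract is to repeat training over several seeds and report the mean and standard deviation rather than a single run. The sketch below is an assumption about how that could look, not the authors' protocol; `summarize_runs` and `toy_train_and_score` are hypothetical names, and the toy scores are placeholders rather than results from the paper.

```python
import statistics
from typing import Callable, Sequence, Tuple


def summarize_runs(
    train_and_score: Callable[[int], float], seeds: Sequence[int]
) -> Tuple[float, float]:
    """Run one training/evaluation per seed and return (mean, sample std dev).

    `train_and_score` is a caller-supplied function that trains a model with
    the given seed and returns a single metric such as F1.
    """
    scores = [train_and_score(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


def toy_train_and_score(seed: int) -> float:
    """Placeholder for a real training run; the values are illustrative only."""
    return [0.5, 0.7, 0.9][seed]


mean_score, std_score = summarize_runs(toy_train_and_score, seeds=[0, 1, 2])
print(f"{mean_score:.2f} +/- {std_score:.2f}")
```

Comparing models on mean and spread across seeds makes it harder for a single lucky initialization to decide the ranking.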
Taxonomy
Topics: Advanced Malware Detection Techniques · Mobile and Web Applications · Network Security and Intrusion Detection
