A two-steps approach to improve the performance of Android malware detectors
Nadia Daoudi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

TL;DR
This paper introduces GUIDED RETRAINING, a two-step supervised learning method that significantly improves Android malware detection accuracy by focusing on difficult samples and leveraging contrastive learning.
Contribution
The paper proposes a novel two-step retraining approach that enhances malware detectors by targeting difficult samples with supervised contrastive learning.
Findings
Reduces malware detection errors by up to 40.41%
Effective across multiple state-of-the-art detectors
Applicable to other binary classification tasks
Abstract
The popularity of Android OS has made it an appealing target for malware developers. To evade detection, including by ML-based techniques, attackers invest in creating malware that closely resembles legitimate apps. In this paper, we propose GUIDED RETRAINING, a supervised representation learning-based method that boosts the performance of a malware detector. First, the dataset is split into "easy" and "difficult" samples, where difficulty is associated with the prediction probabilities yielded by a malware detector: for difficult samples, the probabilities are such that the classifier is not confident in its predictions, which have high error rates. Then, we apply our GUIDED RETRAINING method to the difficult samples to improve their classification. For the subset of "easy" samples, the base malware detector is used to make the final predictions since the error rate on that subset is low…
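The easy/difficult split described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names split_by_confidence and route_predictions, the thresholds low=0.3 and high=0.7, the 0.5 decision cutoff for easy samples, and the retrained_predict callback standing in for whatever model is trained on the difficult subset.

```python
from typing import Callable, List, Sequence, Tuple


def split_by_confidence(
    malware_probs: Sequence[float], low: float = 0.3, high: float = 0.7
) -> Tuple[List[int], List[int]]:
    """Return (easy_indices, difficult_indices).

    A sample counts as "difficult" when the base detector's malware
    probability lies strictly between `low` and `high`, i.e. the detector
    is not confident. The thresholds are hypothetical, not from the paper.
    """
    easy: List[int] = []
    difficult: List[int] = []
    for i, p in enumerate(malware_probs):
        (difficult if low < p < high else easy).append(i)
    return easy, difficult


def route_predictions(
    malware_probs: Sequence[float],
    retrained_predict: Callable[[int], int],
    low: float = 0.3,
    high: float = 0.7,
) -> List[int]:
    """Label easy samples with the base detector, difficult ones with a second model.

    `retrained_predict` is a hypothetical callback that maps a sample index
    to a label (0 = benign, 1 = malware).
    """
    easy, difficult = split_by_confidence(malware_probs, low, high)
    labels = [0] * len(malware_probs)
    for i in easy:
        # Base detector decision: assumed 0.5 cutoff on the malware probability.
        labels[i] = int(malware_probs[i] >= 0.5)
    for i in difficult:
        labels[i] = retrained_predict(i)
    return labels
```

For example, with base probabilities [0.02, 0.45, 0.97, 0.6], samples 0 and 2 would be treated as easy and samples 1 and 3 as difficult under these assumed thresholds. In practice the thresholds would need to be chosen on held-out data.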
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Mobile and Web Applications
Methods: Balanced Selection · Auxiliary Classifier · Contrastive Learning
