Contaminant Removal for Android Malware Detection Systems
Lichao Sun, Xiaokai Wei, Jiawei Zhang, Lifang He, Philip S. Yu, and Witawas Srisa-an

TL;DR
This paper introduces PUDROID, a method that uses positive and unlabeled learning to automatically remove contaminated samples from training datasets, significantly enhancing Android malware detection accuracy.
Contribution
The paper presents PUDROID, a novel contaminant removal approach for training datasets, improving malware detection effectiveness by reducing dataset contamination.
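To make the idea of positive and unlabeled (PU) contaminant removal concrete, here is a minimal sketch. It is not PUDROID's actual algorithm, which this summary does not describe. It substitutes a simple Rocchio-style centroid heuristic: known malware is treated as the positive set, the presumed-benign set is treated as unlabeled, and any unlabeled sample that sits closer to the malware centroid than to the unlabeled centroid is flagged as a likely contaminant. The function names and the toy two-feature vectors are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def remove_contaminants(malware, presumed_benign):
    """Split presumed_benign into (kept, flagged).

    Assumption: a Rocchio-style stand-in for PU learning, not the
    method from the paper. A presumed-benign sample is flagged when it
    lies closer to the malware centroid than to the centroid of the
    presumed-benign (unlabeled) set.
    """
    pos_c = centroid(malware)
    unl_c = centroid(presumed_benign)
    kept, flagged = [], []
    for x in presumed_benign:
        if dist(x, pos_c) < dist(x, unl_c):
            flagged.append(x)
        else:
            kept.append(x)
    return kept, flagged


# Hypothetical toy feature vectors (e.g. two normalized app features).
malware = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]
presumed_benign = [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1], [0.95, 0.9]]

kept, flagged = remove_contaminants(malware, presumed_benign)
print("flagged as likely contaminants:", flagged)
print("kept for benign training set:", kept)
```

In this toy run the last presumed-benign vector resembles the malware samples and is the only one flagged. A real pipeline would use a trained classifier and a tuned decision threshold rather than raw centroid distances.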
Findings
Contaminant removal improves detection rate
Contaminant removal enhances detection accuracy
Feature selection further boosts performance
Abstract
A recent report indicates that there is a new malicious app introduced every 4 seconds. This rapid malware distribution rate causes existing malware detection systems to fall far behind, allowing malicious apps to escape vetting efforts and be distributed even by legitimate app stores. When trusted downloading sites distribute malware, several negative consequences ensue. First, the popularity of these sites would allow such malicious apps to quickly and widely infect devices. Second, analysts and researchers who rely on machine-learning-based detection techniques may also download these apps and mistakenly label them as benign since they have not been disclosed as malware. These apps are then used as part of their benign dataset during model training and testing. The presence of contaminants in a benign dataset can compromise the effectiveness and accuracy of their detection and…
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Software Testing and Debugging Techniques
