Towards Accurate Labeling of Android Apps for Reliable Malware Detection
Aleieldin Salem

TL;DR
This paper examines the challenges of using VirusTotal to label Android apps for malware detection, analyzes the platform's limitations, and proposes guidelines and an architecture for more reliable labeling platforms.
Contribution
It reveals VirusTotal's dynamicity issues affecting labeling accuracy and proposes a new architecture for alternative platforms to improve malware detection reliability.
Findings
VirusTotal's dynamicity impacts labeling consistency
Guidelines for using VirusTotal's threshold-based strategies
Proposed architecture for more reliable labeling platforms
Abstract
In training their newly developed malware detection methods, researchers rely on threshold-based labeling strategies that interpret the scan reports provided by online platforms, such as VirusTotal. The dynamicity of this platform renders those labeling strategies unsustainable over prolonged periods, which leads to inaccurate labels. Using inaccurately labeled apps to train and evaluate malware detection methods significantly undermines the reliability of their results, leading to either dismissing otherwise promising detection approaches or adopting intrinsically inadequate ones. The infeasibility of generating accurate labels via manual analysis and the lack of reliable alternatives force researchers to utilize VirusTotal to label apps. In this paper, we tackle this issue in two ways. Firstly, we reveal the aspects of VirusTotal's dynamicity and how they impact threshold-based…
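To make the idea of a threshold-based labeling strategy concrete, here is a minimal sketch in Python. It is not the paper's implementation and does not call the VirusTotal API. The report format (a mapping from scanner name to a boolean verdict), the scanner names, the function names, and the threshold value are all assumptions chosen for illustration. The sketch shows how the same app can receive different labels when its scan report changes between two points in time.

```python
from typing import Mapping


def count_positives(report: Mapping[str, bool]) -> int:
    """Count the scanners in a report that flag the app as malicious."""
    return sum(1 for flagged in report.values() if flagged)


def label_app(report: Mapping[str, bool], threshold: int) -> str:
    """Label an app 'malicious' if at least `threshold` scanners flag it."""
    return "malicious" if count_positives(report) >= threshold else "benign"


# Hypothetical scan reports for the same app at two points in time.
# Scanner names and verdicts are invented for illustration only.
report_earlier = {
    "ScannerA": True,
    "ScannerB": True,
    "ScannerC": False,
    "ScannerD": False,
}
report_later = {
    "ScannerA": True,
    "ScannerB": False,  # this scanner changed its verdict
    "ScannerC": False,
    "ScannerD": False,
}

THRESHOLD = 2  # assumed value; the choice of threshold is itself a design decision

label_earlier = label_app(report_earlier, THRESHOLD)
label_later = label_app(report_later, THRESHOLD)
print(label_earlier, label_later)
```

With a fixed threshold, a single scanner changing its verdict is enough to flip the label in this toy setting, which is the kind of instability the abstract attributes to the platform's dynamicity.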
