Learning Temporal Invariance in Android Malware Detectors
Xinran Zheng, Shuo Yang, Edith C.H. Ngai, Suman Jana, Lorenzo Cavallaro

TL;DR
This paper introduces TIF, a novel temporal invariant training framework that improves Android malware detectors' robustness over time by learning stable features across temporal distribution shifts, outperforming existing methods.
Contribution
The paper presents the first temporal invariant training framework for malware detection, addressing challenges of distribution drift without prior environment labels.
Findings
TIF significantly improves detection stability on a dataset spanning a decade.
TIF outperforms state-of-the-art methods, especially in early deployment stages.
The framework effectively learns stable representations across temporal environments.
Abstract
Learning-based Android malware detectors degrade over time due to natural distribution drift caused by malware variants and new families. This paper systematically investigates the challenges classifiers trained with empirical risk minimization (ERM) face against such distribution shifts and attributes their shortcomings to their inability to learn stable discriminative features. Invariant learning theory offers a promising solution by encouraging models to generate stable representations across environments that expose the instability of the training set. However, the lack of prior environment labels, the diversity of drift factors, and low-quality representations caused by diverse families make this task challenging. To address these issues, we propose TIF, the first temporal invariant training framework for malware detection, which aims to enhance the ability of detectors to learn…
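The contrast the abstract draws between ERM and invariant learning can be illustrated with a small sketch. The code below is not TIF's objective, which the excerpt above does not spell out; it swaps in the well-known IRMv1-style gradient penalty as a stand-in. It assumes environments are built by slicing samples into equal-sized time windows, and every name in it (split_by_time, irm_penalty, invariant_objective, n_envs, lam) is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_risk(logits, labels):
    # Mean binary cross-entropy, written as log(1 + e^z) - y*z for stability.
    return np.mean(np.logaddexp(0.0, logits) - labels * logits)

def irm_penalty(logits, labels):
    # IRMv1-style penalty: squared gradient of the risk with respect to a
    # dummy scalar multiplier w on the logits, evaluated at w = 1.
    # For the logistic loss this gradient is mean((sigmoid(z) - y) * z).
    grad = np.mean((sigmoid(logits) - labels) * logits)
    return grad ** 2

def split_by_time(timestamps, n_envs):
    # Assumption: with no environment labels available, use contiguous
    # time windows of (nearly) equal size as proxy environments.
    order = np.argsort(timestamps, kind="stable")
    return np.array_split(order, n_envs)

def invariant_objective(logits, labels, timestamps, n_envs=4, lam=1.0):
    # Average per-environment risk plus a penalty that grows when the
    # classifier is not simultaneously optimal in every time window.
    envs = split_by_time(timestamps, n_envs)
    risks = [logistic_risk(logits[idx], labels[idx]) for idx in envs]
    penalties = [irm_penalty(logits[idx], labels[idx]) for idx in envs]
    return np.mean(risks) + lam * np.mean(penalties)
```

With lam set to 0 and equal-sized windows, the objective reduces to the ordinary ERM risk over the pooled data; raising lam trades some average accuracy for features whose predictive relationship holds in each time window.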
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection
Methods: Contrastive Learning · ALIGN
