Realizable Universal Adversarial Perturbations for Malware
Raphael Labaca-Castro, Luis Muñoz-González, Feargus Pendlebury, Gabi Dreo Rodosek, Fabio Pierazzi, Lorenzo Cavallaro

TL;DR
This paper investigates the use of Universal Adversarial Perturbations (UAPs) in malware classification, demonstrating their potential to identify vulnerabilities and proposing defenses that improve model robustness against such attacks.
Contribution
It introduces a method to generate problem-space transformations inducing UAPs in malware classifiers and compares adversarial training-based defenses with feature-space approaches.
Findings
White-box Android evasion attack effectiveness: ~20%
Adversarial training reduces attack success rate
Method extends to Windows malware domain
Abstract
Machine learning classifiers are vulnerable to adversarial examples -- input-specific perturbations that manipulate models' output. Universal Adversarial Perturbations (UAPs), which identify noisy patterns that generalize across the input space, allow the attacker to greatly scale up the generation of such examples. Although UAPs have been explored in application domains beyond computer vision, little is known about their properties and implications in the specific context of realizable attacks, such as malware, where attackers must satisfy challenging problem-space constraints. In this paper we explore the challenges and strengths of UAPs in the context of malware classification. We generate sequences of problem-space transformations that induce UAPs in the corresponding feature-space embedding and evaluate their effectiveness across different malware domains. Additionally, we…
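To make the notion of a universal perturbation concrete, here is a minimal feature-space sketch in Python. It is not the authors' method: the linear classifier, its weights, the binary feature vectors, and the greedy selection rule are all hypothetical choices made for illustration. The only constraint borrowed from the malware setting is that features may be added but never removed, a common simplification of problem-space constraints. The sketch picks one fixed set of features to add and then applies that same set to every sample.

```python
from typing import List, Set

# Hypothetical linear detector: score > 0 means "flagged as malware".
WEIGHTS: List[float] = [2.0, 1.5, -1.0, -2.0, 0.5]
BIAS: float = -1.0

# Hypothetical binary feature vectors, all initially detected.
SAMPLES: List[List[int]] = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
]


def score(x: List[int]) -> float:
    """Linear decision score of one feature vector."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def apply_uap(x: List[int], uap: Set[int]) -> List[int]:
    """Add-only perturbation: set the chosen features to 1, remove nothing."""
    return [1 if i in uap else xi for i, xi in enumerate(x)]


def total_score(samples: List[List[int]], uap: Set[int]) -> float:
    """Sum of detector scores after applying the same perturbation to all samples."""
    return sum(score(apply_uap(x, uap)) for x in samples)


def evasion_rate(samples: List[List[int]], uap: Set[int]) -> float:
    """Fraction of samples no longer flagged once the perturbation is applied."""
    evaded = sum(score(apply_uap(x, uap)) <= 0.0 for x in samples)
    return evaded / len(samples)


def greedy_uap(samples: List[List[int]], budget: int) -> Set[int]:
    """Greedily pick up to `budget` features whose addition lowers the total score most."""
    uap: Set[int] = set()
    for _ in range(budget):
        best_i, best_total = None, total_score(samples, uap)
        for i in range(len(WEIGHTS)):
            if i in uap:
                continue
            candidate = total_score(samples, uap | {i})
            if candidate < best_total:
                best_i, best_total = i, candidate
        if best_i is None:  # no single feature helps any further
            break
        uap.add(best_i)
    return uap


if __name__ == "__main__":
    uap = greedy_uap(SAMPLES, budget=2)
    print("universal feature additions:", sorted(uap))
    print("evasion rate:", evasion_rate(SAMPLES, uap))
```

The point of the toy is the contrast with input-specific adversarial examples: the perturbation is computed once over a set of samples and reused unchanged, which is what lets an attacker scale. A realizable attack would additionally need each chosen feature to correspond to a concrete program transformation, which this sketch does not model.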
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning
