Why Adversarial Reprogramming Works, When It Fails, and How to Tell the Difference
Yang Zheng, Xiaoyi Feng, Zhaoqiang Xia, Xiaoyue Jiang, Ambra Demontis, Maura Pintor, Battista Biggio, Fabio Roli

TL;DR
This paper investigates what determines the success of adversarial reprogramming. A first-order linear model and extensive experiments show that success depends on the size of the average input gradient, which grows when input gradients are more aligned and when inputs have higher dimensionality.
Contribution
The paper develops a first-order linear model of adversarial reprogramming that explains its success and identifies the key factors affecting its effectiveness.
Findings
Success correlates with larger average input gradients.
Higher input dimensionality increases reprogramming success.
More closely aligned input gradients yield a larger average input gradient, and thus more successful reprogramming.
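The findings above can be illustrated with a small numerical sketch. It is not the authors' code: the gradients are synthetic, the helper names `make_gradients` and `first_order_gain` are hypothetical, and the choice of an L-infinity perturbation budget `eps` is an assumption made here for illustration. Under a first-order approximation, adding a shared perturbation delta to every input changes the average loss by roughly the dot product of delta with the average input gradient, so the best achievable decrease under that budget is `eps` times the L1 norm of the average gradient.

```python
import numpy as np

def make_gradients(n, d, alignment, rng):
    """Synthetic per-sample input gradients (hypothetical stand-in for real ones).

    Each row mixes one shared unit direction with per-sample unit noise,
    then is rescaled to unit L2 norm. `alignment` in [0, 1] sets the mix.
    """
    shared = rng.standard_normal(d)
    shared /= np.linalg.norm(shared)
    noise = rng.standard_normal((n, d))
    noise /= np.linalg.norm(noise, axis=1, keepdims=True)
    grads = alignment * shared + (1.0 - alignment) * noise
    return grads / np.linalg.norm(grads, axis=1, keepdims=True)

def first_order_gain(grads, eps):
    """First-order estimate of the average loss decrease from one shared
    perturbation with max-norm at most eps (assumed budget)."""
    mean_grad = grads.mean(axis=0)
    # Moving each coordinate by eps against the sign of the average gradient
    # minimizes the linearized average loss under the max-norm budget.
    delta = -eps * np.sign(mean_grad)
    gain = -(mean_grad @ delta)  # equals eps * L1 norm of the average gradient
    return gain, delta

rng = np.random.default_rng(0)
eps = 0.1

# Same dimensionality, different alignment.
gain_aligned, _ = first_order_gain(make_gradients(200, 100, 0.9, rng), eps)
gain_scattered, _ = first_order_gain(make_gradients(200, 100, 0.1, rng), eps)

# Same alignment, different dimensionality.
gain_small_d, _ = first_order_gain(make_gradients(200, 100, 0.9, rng), eps)
gain_large_d, _ = first_order_gain(make_gradients(200, 400, 0.9, rng), eps)

print("aligned vs scattered:", gain_aligned, gain_scattered)
print("d=100 vs d=400:", gain_small_d, gain_large_d)
```

In this toy setting, scattered gradients largely cancel when averaged, so the estimated gain shrinks. The dimensionality comparison depends on the assumption that each gradient keeps unit L2 norm and is dense; real models may behave differently.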
Abstract
Adversarial reprogramming allows repurposing a machine-learning model to perform a different task. For example, a model trained to recognize animals can be reprogrammed to recognize digits by embedding an adversarial program in the digit images provided as input. Recent work has shown that adversarial reprogramming may not only be used to abuse machine-learning models provided as a service, but also beneficially, to improve transfer learning when training data is scarce. However, the factors affecting its success are still largely unexplained. In this work, we develop a first-order linear model of adversarial reprogramming to show that its success inherently depends on the size of the average input gradient, which grows when input gradients are more aligned, and when inputs have higher dimensionality. The results of our experimental analysis, involving fourteen distinct reprogramming…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Integrated Circuits and Semiconductor Failure Analysis
