Publishing Efficient On-device Models Increases Adversarial Vulnerability
Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

TL;DR
Publishing efficient on-device versions of neural network models increases the adversarial vulnerability of the original large models, but reducing the similarity between the two models can significantly mitigate this risk.
Contribution
The paper demonstrates that publishing on-device models increases the adversarial vulnerability of the original models, and it proposes a defense, similarity-unpairing, to mitigate this issue.
Findings
On-device models increase transfer-based attack success by up to 100x.
Reducing similarity between models decreases transferability by up to 90%.
The proposed defense increases the number of queries an attack requires by a factor of 10-100x.
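To make "similarity between models" concrete, one possible proxy is the cosine similarity between the two models' input gradients. The sketch below is an illustration under assumptions, not the metric used in the paper: it uses toy linear scorers (whose input gradient is simply the weight vector), and the weight values are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical weights. For a linear scorer w.x, the gradient of the
# score with respect to the input x is the weight vector w itself.
full_grad = [1.2, -0.7, 0.4]        # stand-in for the full-scale model
on_device_grad = [1.0, -0.5, 0.5]   # stand-in for a close on-device variant
unpaired_grad = [-1.0, 0.5, -0.5]   # stand-in for a dissimilar variant

similar = cosine_similarity(full_grad, on_device_grad)
dissimilar = cosine_similarity(full_grad, unpaired_grad)
```

A value near 1 means a perturbation direction that hurts one model is likely to hurt the other; a defense that lowers this kind of similarity would be expected to lower transferability as well.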
Abstract
Recent increases in the computational demands of deep neural networks (DNNs) have sparked interest in efficient deep learning mechanisms, e.g., quantization or pruning. These mechanisms enable the construction of a small, efficient version of commercial-scale models with comparable accuracy, accelerating their deployment to resource-constrained devices. In this paper, we study the security considerations of publishing on-device variants of large-scale models. We first show that an adversary can exploit on-device models to make attacking the large models easier. In evaluations across 19 DNNs, by exploiting the published on-device models as a transfer prior, the adversarial vulnerability of the original commercial-scale models increases by up to 100x. We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases. Based on the…
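The "transfer prior" idea can be sketched with a toy example: an adversary crafts a perturbation using only the published on-device model, then checks whether it also fools the full-scale model. Everything below is an assumption for illustration and not the paper's experimental setup: the models are linear classifiers, `quantize` is a simple rounding stand-in for real quantization, the attack is a single sign-of-gradient step, and all names and values are hypothetical.

```python
def quantize(weights, step=0.5):
    """Round each weight to the nearest multiple of `step` (toy quantization)."""
    return [round(w / step) * step for w in weights]

def predict(weights, x):
    """Binary linear classifier: class 1 if w.x > 0, else class 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def craft_on_surrogate(surrogate_weights, x, label, eps):
    """One sign-of-gradient step computed on the surrogate model only.

    For a linear scorer the input gradient is the weight vector, so the
    step moves each coordinate by eps against the current label.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * eps * sign(w)
            for xi, w in zip(x, surrogate_weights)]

def transfer_rate(full_weights, surrogate_weights, inputs, eps):
    """Fraction of inputs whose surrogate-crafted example flips the full model."""
    flipped = 0
    for x in inputs:
        label = predict(full_weights, x)
        x_adv = craft_on_surrogate(surrogate_weights, x, label, eps)
        if predict(full_weights, x_adv) != label:
            flipped += 1
    return flipped / len(inputs)

full = [1.2, -0.7, 0.4]             # hypothetical full-scale weights
on_device = quantize(full)          # published efficient variant
unrelated = [-1.0, 0.5, -0.5]       # surrogate dissimilar to the full model
inputs = [[0.3, 0.1, 0.2]]

rate_with_prior = transfer_rate(full, on_device, inputs, eps=0.3)
rate_without_prior = transfer_rate(full, unrelated, inputs, eps=0.3)
```

In this toy setting the quantized model preserves the sign of every weight, so a perturbation crafted on it transfers to the full model, while the dissimilar surrogate's perturbation does not.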
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Security and Verification in Computing
