Adversarial Eigen Attack on Black-Box Models
Linjun Zhou, Peng Cui, Yinan Jiang, Shiqiang Yang

TL;DR
This paper introduces EigenBA, a novel black-box adversarial attack method that leverages the Jacobian of a pre-trained white-box model to improve attack efficiency without additional training data.
Contribution
Proposes a new transferable black-box attack setting using pre-trained models without further tuning, and introduces EigenBA leveraging Jacobian singular vectors for efficient attacks.
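To make the role of the Jacobian singular vectors concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy feature map `f(x) = tanh(W @ x)`, the helper names `jacobian_tanh_layer` and `top_singular_directions`, and all sizes are assumptions chosen for illustration. The sketch only shows that, to first order, the top right singular vector of a white-box model's Jacobian is the unit input direction that changes the model's output the most, which is why such directions are plausible candidates for perturbing a black-box model.

```python
import numpy as np

def jacobian_tanh_layer(W, x):
    """Jacobian of the toy feature map f(x) = tanh(W @ x) with respect to x.

    This stand-in for a pre-trained white-box model is an assumption for
    illustration only; it is not the model used in the paper.
    """
    pre = W @ x
    # d tanh(u)/du = 1 - tanh(u)^2, applied row-wise to W.
    return (1.0 - np.tanh(pre) ** 2)[:, None] * W

def top_singular_directions(J, k):
    """Return the k largest singular values and the matching right singular vectors."""
    _, s, vt = np.linalg.svd(J, full_matrices=False)
    return s[:k], vt[:k]

rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(8, 32))   # hypothetical white-box weights
x = rng.normal(size=32)              # hypothetical input

J = jacobian_tanh_layer(W, x)
sigmas, directions = top_singular_directions(J, k=4)

# Compare the first-order output change along the top singular direction
# with the change along a random unit direction.
random_dir = rng.normal(size=32)
random_dir /= np.linalg.norm(random_dir)
gain_top = np.linalg.norm(J @ directions[0])
gain_random = np.linalg.norm(J @ random_dir)
```

In this sketch `gain_top` equals the largest singular value, so no unit direction, including `random_dir`, can produce a larger first-order change. How the directions are then queried against the black-box model is not specified in this summary and is left out here.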
Findings
EigenBA improves attack success rates on ImageNet and CIFAR-10.
Pre-trained models that cannot be further tuned can still improve attack efficiency.
The method maintains small perturbations while increasing attack effectiveness.
Abstract
Black-box adversarial attacks have attracted considerable research interest because of their practical relevance to AI safety. Compared with the white-box attack, the black-box setting is harder: less information about the attacked model is available, and the query budget imposes an additional constraint. A general way to improve attack efficiency is to draw on a pre-trained, transferable white-box model. In this paper, we propose a novel setting of transferable black-box attack: attackers may use external information from a pre-trained model with available network parameters; however, unlike in previous studies, no additional training data is permitted to further change or tune the pre-trained model. To this end, we propose a new algorithm, EigenBA, to tackle this problem. Our method aims to explore more gradient information of the black-box model, and promote the attack…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
