Improving Adversarial Transferability with Gradient Refining
Guoqiu Wang, Huanqian Yan, Ying Guo, Xingxing Wei

TL;DR
This paper introduces Gradient Refining, a technique that improves the transferability of adversarial examples across models and raises black-box attack success rates on ImageNet.
Contribution
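Gradient Refining corrects the useless gradients that input diversity introduces by drawing on multiple transformations. It applies to gradient-based attacks that use input diversity, improves their transferability, and outperforms existing methods in black-box adversarial attacks.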
Findings
Achieves an 82.07% transfer success rate on ImageNet.
Outperforms state-of-the-art methods by 6.0% on average.
Secured second place in the CVPR 2021 adversarial attack competition.
Abstract
Deep neural networks are vulnerable to adversarial examples, which are crafted by adding human-imperceptible perturbations to original images. Most existing adversarial attack methods achieve nearly 100% attack success rates under the white-box setting, but only achieve relatively low attack success rates under the black-box setting. To improve the transferability of adversarial examples for the black-box setting, several methods have been proposed, e.g., input diversity, translation-invariant attack, and momentum-based attack. In this paper, we propose a method named Gradient Refining, which can further improve the adversarial transferability by correcting useless gradients introduced by input diversity through multiple transformations. Our method is generally applicable to many gradient-based attack methods combined with input diversity. Extensive experiments are conducted on the…
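The idea of correcting noisy input-diversity gradients "through multiple transformations" can be illustrated with a small sketch. This is not the authors' implementation: it assumes that refining means averaging the gradients obtained under several independent random transformations, and it swaps in a toy linear loss and a random coordinate mask as hypothetical stand-ins for a real network and for the image transformations used by input diversity. All function names and parameters (`diverse_grad`, `refined_grad`, `keep_prob`, `m`) are illustrative.

```python
import random


def loss_grad(x, w):
    """Gradient of the toy linear loss L(x) = sum(w_i * x_i) with respect to x."""
    return list(w)  # for a linear loss the gradient is w, independent of x


def diverse_grad(x, w, rng, keep_prob=0.5):
    """Gradient through ONE random transformation of the input.

    Assumption: the transformation zeroes each coordinate with probability
    1 - keep_prob. It is only a stand-in for input diversity.
    """
    mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in x]
    transformed = [m_i * x_i for m_i, x_i in zip(mask, x)]
    g = loss_grad(transformed, w)
    # Chain rule: d(m_i * x_i)/dx_i = m_i, so masked coordinates get zero gradient.
    return [m_i * g_i for m_i, g_i in zip(mask, g)]


def refined_grad(x, w, rng, m=10, keep_prob=0.5):
    """Average the gradient over m independent random transformations."""
    total = [0.0] * len(x)
    for _ in range(m):
        g = diverse_grad(x, w, rng, keep_prob)
        total = [t + g_i for t, g_i in zip(total, g)]
    return [t / m for t in total]


def sign_step(x, grad, alpha):
    """One sign-gradient update of size alpha, as in FGSM-style attacks."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [x_i + alpha * sign(g_i) for x_i, g_i in zip(x, grad)]
```

In this toy setting a single call to `diverse_grad` returns zero for every coordinate the random mask dropped, so a sign step leaves those coordinates unchanged. Averaging over `m` draws in `refined_grad` recovers a scaled copy of the underlying gradient, so the sign step moves every coordinate in a consistent direction.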
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
