TL;DR
This paper introduces LLTA, a meta-learning-based method that generates more transferable adversarial attacks by augmenting both data and models, improving transfer success rates in black-box scenarios.
Contribution
The paper proposes a novel learning-to-learn framework with data and model augmentation for more effective transfer adversarial attacks, outperforming existing methods.
Findings
Achieves a 12.85% higher transfer attack success rate than state-of-the-art methods.
Demonstrates effectiveness on both benchmark datasets and real-world systems.
Utilizes meta-learning with data and model augmentation to enhance attack generalization.
Abstract
A transfer adversarial attack is a non-trivial black-box adversarial attack that crafts adversarial perturbations on a surrogate model and then applies those perturbations to the victim model. However, the transferability of perturbations from existing methods is still limited, since the adversarial perturbations easily overfit to a single surrogate model and a specific data pattern. In this paper, we propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized by learning from both data and model augmentation. For data augmentation, we adopt simple random resizing and padding. For model augmentation, we randomly alter the back-propagation instead of the forward propagation, so that the model prediction is unaffected. By treating the attack of both specific data and a modified model as a task, we expect the…
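The two augmentations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the size range, the nearest-neighbour resize, and the choice of a randomly scaled ReLU gradient as the "altered back-propagation" are all assumptions made for illustration only.

```python
import numpy as np


def random_resize_and_pad(image, out_size, min_size, rng):
    """Resize an HxW(xC) image to a random square size, then zero-pad it
    to out_size x out_size at a random offset.

    Hypothetical helper: size range and nearest-neighbour resize are assumptions.
    """
    h, w = image.shape[:2]
    new_size = int(rng.integers(min_size, out_size + 1))

    # Nearest-neighbour resize via integer index lookup.
    rows = np.arange(new_size) * h // new_size
    cols = np.arange(new_size) * w // new_size
    resized = image[rows][:, cols]

    # Place the resized image at a random position inside a zero canvas.
    pad_total = out_size - new_size
    top = int(rng.integers(0, pad_total + 1))
    left = int(rng.integers(0, pad_total + 1))
    padded = np.zeros((out_size, out_size) + image.shape[2:], dtype=image.dtype)
    padded[top:top + new_size, left:left + new_size] = resized
    return padded


def relu_forward(x):
    """Standard ReLU forward pass; left untouched so predictions do not change."""
    return np.maximum(x, 0.0)


def relu_backward_augmented(x, grad_out, rng, low=0.5, high=1.0):
    """ReLU backward pass with the gradient scaled by a random factor.

    Assumption: random gradient scaling stands in for "randomly altering
    the back propagation"; the range [low, high] is hypothetical.
    """
    gamma = rng.uniform(low, high)
    return gamma * grad_out * (x > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((24, 24, 3)).astype(np.float32)
    aug = random_resize_and_pad(img, out_size=32, min_size=24, rng=rng)
    print(aug.shape)

    x = np.array([-1.0, 2.0, 3.0])
    print(relu_forward(x))
    print(relu_backward_augmented(x, np.ones_like(x), rng))
```

In this sketch, each call produces a differently augmented input or gradient, so repeated attack steps see varied data and a varied surrogate while the forward prediction stays fixed.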
