Enhancing Adversarial Transferability with Adversarial Weight Tuning
Jiahao Chen, Zhou Feng, Rui Zeng, Yuwen Pu, Chunyi Zhou, Yi Jiang, Yuyou Gan, Jinbao Li, Shouling Ji

TL;DR
This paper introduces Adversarial Weight Tuning (AWT), a data-free method that improves the transferability of adversarial examples by tuning the surrogate model for smoothness and flat local maxima; it reports higher attack success rates than existing attacks.
Contribution
The paper proposes AWT, a new adversarial attack method that adaptively tunes surrogate model parameters to improve transferability without extra data, grounded in insights about model smoothness and local maxima.
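To make the idea of tuning surrogate parameters during an attack concrete, here is a minimal NumPy sketch. It is not the authors' algorithm: the logistic surrogate, the sharpness-aware (SAM-style) weight step, and every name and hyperparameter (`attack_with_weight_tuning`, `rho`, `lr_w`, `eps`) are assumptions chosen for illustration. It only shows the general shape of such a loop, which alternates a weight update computed from the current adversarial point (no extra data) with a sign-gradient step on the input.

```python
import numpy as np

def loss_and_grads(w, x, y):
    """Logistic loss for a label y in {-1, +1}, with gradients w.r.t. w and x."""
    margin = y * np.dot(w, x)
    loss = np.logaddexp(0.0, -margin)        # log(1 + exp(-margin)), computed stably
    coeff = -y / (1.0 + np.exp(margin))      # d loss / d (w . x)
    return loss, coeff * x, coeff * w

def attack_with_weight_tuning(w, x, y, eps=0.5, steps=10, lr_w=0.05, rho=0.05):
    """Hypothetical loop: alternate a SAM-style weight step with an input attack step."""
    w = w.copy()                             # tune a copy; the caller's weights stay intact
    x_adv = x.copy()
    alpha = eps / steps                      # per-step size of the input perturbation
    for _ in range(steps):
        # (1) Weight step (assumed stand-in for the tuning rule): probe the weights
        #     in the locally worst direction, then descend using the probe's gradient.
        _, g_w, _ = loss_and_grads(w, x_adv, y)
        w_probe = w + rho * g_w / (np.linalg.norm(g_w) + 1e-12)
        _, g_w_probe, _ = loss_and_grads(w_probe, x_adv, y)
        w = w - lr_w * g_w_probe
        # (2) Attack step: sign-gradient ascent on the input, kept in the L-inf ball.
        _, _, g_x = loss_and_grads(w, x_adv, y)
        x_adv = x_adv + alpha * np.sign(g_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv, w
```

Calling `attack_with_weight_tuning(w0, x0, y=1.0)` returns the perturbed input and the tuned copy of the weights. A real surrogate would be a deep network with gradients from automatic differentiation, and the weight update would follow the paper's own objective rather than this placeholder.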
Findings
AWT achieves nearly 5% and 10% higher attack success rates on CNN and Transformer models, respectively.
Extensive experiments demonstrate AWT's superior transferability over state-of-the-art attacks.
Theoretical analysis links model smoothness and flat local maxima to adversarial transferability.
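One way to build intuition for "flat local maxima" is to measure how much the loss drops when an adversarial example is nudged slightly. The sketch below is an illustrative proxy, not a quantity defined in the paper: `flatness_gap`, its sampling scheme, and its default radius are assumptions. A smaller gap means the loss stays high in a neighbourhood of the adversarial point, which is the property the analysis associates with better transfer.

```python
import numpy as np

def flatness_gap(loss_fn, x_adv, radius=0.05, n_samples=64, seed=0):
    """Largest loss drop over random L-inf perturbations of x_adv (hypothetical proxy).

    loss_fn maps an input array to a scalar loss. A value near 0 suggests the
    loss surface is flat around x_adv; a large value suggests a sharp peak.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(x_adv)
    deltas = rng.uniform(-radius, radius, size=(n_samples,) + x_adv.shape)
    drops = [base - loss_fn(x_adv + d) for d in deltas]
    return max(0.0, max(drops))
```

For example, comparing `flatness_gap` on two concave losses peaked at the same point, such as `lambda x: -np.sum(x ** 2)` and `lambda x: -10 * np.sum(x ** 2)`, gives a larger gap for the sharper one. Random sampling only lower-bounds the worst-case drop, so this is a rough diagnostic rather than a certificate.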
Abstract
Deep neural networks (DNNs) are vulnerable to adversarial examples (AEs) that mislead the model while appearing benign to human observers. A critical concern is the transferability of AEs, which enables black-box attacks without direct access to the target model. However, many previous attacks have failed to explain the intrinsic mechanism of adversarial transferability. In this paper, we rethink the property of transferable AEs and reformulate transferability. Building on insights from this mechanism, we analyze the generalization of AEs across models with different architectures and prove that we can find a local perturbation to mitigate the gap between surrogate and target models. We further establish the inner connections between model smoothness and flat local maxima, both of which contribute to the transferability of AEs. Further, we propose a new adversarial…
Taxonomy
Topics: Adversarial Robustness in Machine Learning
