Enhancing Adversarial Transferability with Adversarial Weight Tuning

Jiahao Chen; Zhou Feng; Rui Zeng; Yuwen Pu; Chunyi Zhou; Yi Jiang; Yuyou Gan; Jinbao Li; Shouling Ji

arXiv:2408.09469 · cs.CR · October 21, 2025

TL;DR

This paper introduces Adversarial Weight Tuning (AWT), a novel data-free method that enhances the transferability of adversarial examples by optimizing model smoothness and flat local maxima, outperforming existing attacks.

Contribution

The paper proposes AWT, a new adversarial attack method that adaptively tunes surrogate model parameters to improve transferability without extra data, grounded in insights about model smoothness and local maxima.
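The idea of tuning surrogate parameters while crafting an adversarial example can be sketched on a toy model. The block below is only an illustration under stated assumptions: it uses a linear logistic surrogate, an iterative sign-gradient input step, and a placeholder weight update (a plain gradient step on the adversarial input). The function name `attack_with_weight_tuning`, the hyperparameters, and the weight-update rule are all hypothetical; the smoothness and flatness objectives that AWT actually optimizes are not reproduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, x):
    # Linear surrogate: s = w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b, x, y):
    # Logistic loss for a label y in {-1, +1}
    return math.log(1.0 + math.exp(-y * score(w, b, x)))

def sign(v):
    return (v > 0) - (v < 0)

def attack_with_weight_tuning(w, b, x, y, eps=0.1, alpha=0.025, steps=4, lr=0.01):
    """Alternate an input-ascent step with a placeholder weight step (hypothetical)."""
    w = list(w)
    x_adv = list(x)
    for _ in range(steps):
        # dL/ds for the logistic loss; dL/dx_i = coeff * w_i
        coeff = -y * sigmoid(-y * score(w, b, x_adv))
        # Sign-gradient ascent on the input, clipped to the L-infinity ball around x
        x_adv = [
            min(max(xa + alpha * sign(coeff * wi), xi - eps), xi + eps)
            for xa, wi, xi in zip(x_adv, w, x)
        ]
        # Placeholder tuning step (assumption, not the paper's rule):
        # one gradient step on the weights at the current adversarial input
        coeff = -y * sigmoid(-y * score(w, b, x_adv))
        w = [wi - lr * coeff * xa for wi, xa in zip(w, x_adv)]
    return x_adv, w

# Toy values chosen for illustration only
w0, b0 = [1.0, -2.0, 0.5], 0.1
x0, y0 = [0.5, -0.3, 0.8], 1
x_adv, w_tuned = attack_with_weight_tuning(w0, b0, x0, y0)
```

The sketch needs no extra data beyond the input being attacked, which mirrors the data-free framing above; everything else about the update schedule is a stand-in.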

Findings

01

AWT achieves nearly 5% and 10% higher attack success rates on CNN-based and Transformer-based models, respectively.

02

Extensive experiments demonstrate AWT's superior transferability over state-of-the-art attacks.

03

Theoretical analysis links model smoothness and flat local maxima to adversarial transferability.

Abstract

Deep neural networks (DNNs) are vulnerable to adversarial examples (AEs) that mislead the model while appearing benign to human observers. A critical concern is the transferability of AEs, which enables black-box attacks without direct access to the target model. However, many previous attacks have failed to explain the intrinsic mechanism of adversarial transferability. In this paper, we rethink the property of transferable AEs and give a new formulation of transferability. Building on insights from this mechanism, we analyze the generalization of AEs across models with different architectures and prove that we can find a local perturbation to mitigate the gap between surrogate and target models. We further establish the inner connections between model smoothness and flat local maxima, both of which contribute to the transferability of AEs. Further, we propose a new adversarial…
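The black-box transfer setting described in the abstract can be made concrete with a small sketch: AEs are crafted on a surrogate model and then scored against a separate target model the attacker never queries. The example below assumes two toy linear classifiers and a single-step FGSM-style attack; the weights, data points, and helper names (`fgsm_linear`, `attack_success_rate`) are hypothetical and are not taken from the paper.

```python
def sign(v):
    return (v > 0) - (v < 0)

def predict(w, x):
    # Linear classifier with labels in {-1, +1}
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def fgsm_linear(w, x, y, eps):
    # For a linear score and a margin-based loss, the loss gradient
    # with respect to x has sign -y * sign(w), so one signed step suffices.
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

def attack_success_rate(w_target, examples, labels):
    # Fraction of examples that the target model misclassifies
    fooled = sum(predict(w_target, x) != y for x, y in zip(examples, labels))
    return fooled / len(labels)

# Toy surrogate and target with different weights (illustrative values)
w_surrogate = [1.0, 1.0]
w_target = [1.0, 0.5]

inputs = [[0.3, 0.2], [1.0, 1.0], [-0.2, -0.1], [-1.5, -0.5]]
labels = [1, 1, -1, -1]

# Craft AEs using only the surrogate, then evaluate them on the target
adv_inputs = [fgsm_linear(w_surrogate, x, y, eps=0.4) for x, y in zip(inputs, labels)]
clean_error = attack_success_rate(w_target, inputs, labels)
transfer_rate = attack_success_rate(w_target, adv_inputs, labels)
```

With these toy values the target classifies every clean input correctly, while half of the surrogate-crafted AEs fool it. The attack success rates reported in the findings are this kind of ratio, measured on real target models.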

Taxonomy

Topics: Adversarial Robustness in Machine Learning