Why Does Little Robustness Help? A Further Step Towards Understanding Adversarial Transferability

Yechao Zhang; Shengshan Hu; Leo Yu Zhang; Junyu Shi; Minghui Li; Xiaogeng Liu; Wei Wan; Hai Jin

arXiv:2307.07873 · cs.LG · December 17, 2025

TL;DR

This paper investigates why surrogate models with "little robustness" (models adversarially trained on only mildly perturbed samples) produce more transferable adversarial examples. It attributes the effect to a trade-off between model smoothness and gradient similarity, and supports this with theoretical and empirical analyses.

Contribution

The paper shows how model smoothness and gradient similarity jointly affect transferability, and proposes combining input gradient regularization with sharpness-aware minimization (SAM) to train better surrogate models.
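As a rough illustration of how input gradient regularization and SAM can be combined in a single training step, here is a minimal sketch on a toy logistic-regression model. The data, the hyperparameters (`lam`, `rho`, `lr`), the function names, and the choice to add the two terms into one objective are all assumptions made for illustration; this listing does not give the paper's exact formulation. Finite-difference gradients are used only to keep the sketch dependency-free.

```python
import math

# Toy, linearly separable data (hypothetical; not from the paper).
XS = [(1.0, 2.0), (2.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
YS = [1.0, 1.0, 0.0, 0.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def objective(params, lam=0.1):
    """Mean cross-entropy plus lam * squared norm of the input gradient."""
    w, b = params[:-1], params[-1]
    w_sq = sum(wi * wi for wi in w)
    total = 0.0
    for x, y in zip(XS, YS):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard the logarithms
        ce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        # For logistic regression, d(ce)/dx = (p - y) * w,
        # so its squared norm is (p - y)^2 * ||w||^2.
        grad_x_sq = (p - y) ** 2 * w_sq
        total += ce + lam * grad_x_sq
    return total / len(XS)


def num_grad(f, params, h=1e-5):
    """Central finite-difference gradient of f at params."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        up[i] += h
        down = list(params)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def sam_step(params, lr=0.1, rho=0.05):
    """One SAM update: perturb the weights along the normalized gradient,
    then apply the gradient taken at the perturbed point to the original weights."""
    g = num_grad(objective, params)
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
    perturbed = [p + rho * gi / norm for p, gi in zip(params, g)]
    g_sam = num_grad(objective, perturbed)
    return [p - lr * gi for p, gi in zip(params, g_sam)]


def train(steps=200):
    params = [0.0, 0.0, 0.0]  # two weights and a bias
    for _ in range(steps):
        params = sam_step(params)
    return params
```

In a real setting the same two-pass structure would wrap a deep network's optimizer, with the input-gradient penalty computed by automatic differentiation rather than in closed form.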

Findings

01

Little robustness enhances transferability due to the trade-off between smoothness and gradient similarity.

02

Data distribution shift in adversarial training reduces gradient similarity, affecting transferability.

03

Combining gradient regularization and SAM improves surrogate models for transfer attacks.
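To make the notion of gradient similarity concrete, the sketch below measures it as the cosine similarity between the input gradients of a surrogate and a target model. That definition is an assumption here (the listing does not state the paper's exact metric), and the logistic models, their weights, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input x for a logistic model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def cosine_similarity(u, v):
    dot = sum(ui * vi for ui, vi in zip(u, v))
    nu = math.sqrt(sum(ui * ui for ui in u))
    nv = math.sqrt(sum(vi * vi for vi in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def gradient_similarity(surrogate, target, x, y):
    """Cosine similarity between the two models' input gradients at (x, y).
    Each model is a (weights, bias) pair."""
    g_s = input_gradient(*surrogate, x, y)
    g_t = input_gradient(*target, x, y)
    return cosine_similarity(g_s, g_t)


# Hypothetical models: one target shares the surrogate's gradient direction,
# the other is orthogonal to it.
surrogate = ([1.0, 0.0], 0.0)
aligned_target = ([2.0, 0.0], 0.5)
orthogonal_target = ([0.0, 1.0], 0.0)
x, y = [0.3, -0.2], 1.0

sim_aligned = gradient_similarity(surrogate, aligned_target, x, y)
sim_orthogonal = gradient_similarity(surrogate, orthogonal_target, x, y)
```

A perturbation built from the surrogate's gradient moves the aligned target's loss in the same direction, while it has no first-order effect on the orthogonal target.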

Abstract

Adversarial examples (AEs) for DNNs have been shown to be transferable: AEs that successfully fool white-box surrogate models can also deceive other black-box models with different architectures. Although a bunch of empirical studies have provided guidance on generating highly transferable AEs, many of these findings lack explanations and even lead to inconsistent advice. In this paper, we take a further step towards understanding adversarial transferability, with a particular focus on surrogate aspects. Starting from the intriguing little robustness phenomenon, where models adversarially trained with mildly perturbed adversarial samples can serve as better surrogates, we attribute it to a trade-off between two predominant factors: model smoothness and gradient similarity. Our investigations focus on their joint effects, rather than their separate correlations with transferability.…

Code & Models

Repositories

cgcl-codes/transferattacksurrogates (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Sharpness-Aware Minimization · Focus