TL;DR
The paper introduces Sequential Difference Maximization (SDM), a gradient-based attack for evaluating model robustness. It traces a weakness of earlier attacks to "high-loss non-adversarial examples" and reformulates the attack objective to achieve stronger attack performance.
Contribution
It proposes a three-layer "cycle-stage-step" optimization framework with dedicated loss functions (the negative probability loss and the Directional Probability Difference Ratio) to make gradient-based adversarial attacks more effective.
Findings
SDM outperforms previous attack methods in strength.
SDM demonstrates superior cost-effectiveness.
Experiments validate SDM's effectiveness across models.
Abstract
Gradient-based attacks are important methods for evaluating model robustness. However, since the proposal of APGD, it has been difficult for such methods to achieve significant breakthroughs. To achieve such a breakthrough, we first analyze the issue of "high-loss non-adversarial examples", which degrades attack performance in previous methods, and prove that this issue arises from inappropriate objectives for adversarial example generation. Subsequently, we reconstruct the objective as "maximizing the difference between the non-ground-truth label probability upper bound and the ground-truth label probability", and propose a novel and powerful gradient-based attack method named Sequential Difference Maximization (SDM). SDM establishes a three-layer optimization framework of "cycle-stage-step". It adopts the negative probability loss function and the Directional Probability Difference Ratio…
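The "high-loss non-adversarial example" issue and the reconstructed objective from the abstract can be illustrated with a small sketch. This is not the paper's implementation and does not reproduce its loss functions. It assumes that the "non-ground-truth label probability upper bound" means the largest probability among the wrong labels, and the function names and probability vectors below are hypothetical, chosen only for illustration.

```python
import math

def cross_entropy(probs, true_label):
    """Standard cross-entropy loss for one example."""
    return -math.log(probs[true_label])

def probability_difference(probs, true_label):
    """Largest wrong-label probability minus the true-label probability.

    Assumption: this is one reading of the objective quoted in the abstract,
    not the paper's exact loss. A positive value means the model's top
    prediction is a wrong label.
    """
    top_other = max(p for i, p in enumerate(probs) if i != true_label)
    return top_other - probs[true_label]

def is_adversarial(probs, true_label):
    """The example fools the model when a wrong label beats the true one."""
    return probability_difference(probs, true_label) > 0

# Hypothetical softmax outputs for a 3-class model; the true label is 0.
flat = [0.34, 0.33, 0.33]    # high loss, yet still classified correctly
fooled = [0.45, 0.55, 0.00]  # lower loss, yet misclassified

for name, probs in [("flat", flat), ("fooled", fooled)]:
    print(
        name,
        "cross-entropy:", round(cross_entropy(probs, 0), 4),
        "difference:", round(probability_difference(probs, 0), 4),
        "adversarial:", is_adversarial(probs, 0),
    )
```

In this toy case, the flat output has the higher cross-entropy but is still classified correctly, while the second output has the lower cross-entropy and is misclassified. The probability difference, by contrast, is positive exactly when the example is misclassified, which is one way to see why an objective built on that difference tracks attack success more directly than the loss value alone.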
