Gradient-Free Adversarial Purification with Diffusion Models
Xuelong Dai, Dong Wang, Xiuzhen Cheng, Bin Xiao

TL;DR
This paper presents an efficient defense framework against both perturbation-based and unrestricted adversarial attacks. It combines adversarial anti-aliasing, super-resolution, and contrastive learning-based fine-tuning, and it does not require retraining diffusion models.
Contribution
The paper introduces a gradient-free adversarial purification method using diffusion models, combining preprocessing and fine-tuning techniques for robust defense.
Findings
Effective against diverse adversarial attacks
No additional training required for purification techniques
Enhanced robustness through dataset-specific fine-tuning
Abstract
Adversarial training and adversarial purification are two widely used defense strategies for enhancing model robustness against adversarial attacks. However, adversarial training requires costly retraining, while adversarial purification often suffers from low efficiency. More critically, existing defenses are primarily designed under the perturbation-based adversarial threat model, which is ineffective against recently introduced unrestricted adversarial attacks. In this paper, we propose an effective and efficient defense framework that counters both perturbation-based and unrestricted adversarial attacks. Our approach is motivated by the observation that adversarial examples typically lie near the decision boundary and are highly sensitive to pixel-level perturbations. To address this, we introduce adversarial anti-aliasing, a preprocessing technique that mitigates adversarial noise…
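The abstract excerpt does not specify how adversarial anti-aliasing is implemented, so the following is only a minimal sketch of the general idea it describes: a low-pass preprocessing step that attenuates pixel-level noise before classification. The mean (box) filter, the function name `box_blur`, the kernel size, and the toy image are all assumptions chosen for illustration, not the authors' method.

```python
import numpy as np


def box_blur(image: np.ndarray, k: int = 3) -> np.ndarray:
    """Low-pass filter a 2-D grayscale image with a k x k mean filter.

    Hypothetical stand-in for an anti-aliasing step; the paper's actual
    filter is not given in the excerpt above.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    pad = k // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


# Toy example: a flat gray image with one perturbed pixel (illustrative values).
clean = np.full((8, 8), 0.5)
perturbed = clean.copy()
perturbed[4, 4] += 0.09

smoothed = box_blur(perturbed, k=3)

# Largest deviation from the clean image before and after filtering.
residual_before = np.abs(perturbed - clean).max()
residual_after = np.abs(smoothed - clean).max()
print(residual_before, residual_after)
```

In this toy case the isolated perturbation is spread over the nine pixels of the filter window, so its peak magnitude shrinks by a factor of nine. Real adversarial noise is spatially structured, and smoothing also removes legitimate detail, which is presumably why the framework pairs this step with super-resolution.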
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis
