TL;DR
This paper introduces DiffDefense, a diffusion-model-based reconstruction technique that defends classifiers against adversarial attacks without modifying the classifiers, while preserving clean accuracy and speed.
Contribution
The paper proposes a diffusion-model-based adversarial defense that is fast, effective, and compatible with existing classifiers without modification.
Findings
Robustness against adversarial attacks demonstrated
Maintains classifier accuracy on clean data
Compatible with existing models without changes
Abstract
This paper presents a novel reconstruction method that leverages Diffusion Models to protect machine learning classifiers against adversarial attacks, all without requiring any modifications to the classifiers themselves. The susceptibility of machine learning models to minor input perturbations renders them vulnerable to adversarial attacks. While diffusion-based methods are typically disregarded for adversarial defense due to their slow reverse process, this paper demonstrates that our proposed method offers robustness against adversarial threats while preserving clean accuracy, speed, and plug-and-play compatibility. Code at: https://github.com/HondamunigePrasannaSilva/DiffDefence.
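The plug-and-play idea in the abstract (reconstruct the input first, then pass it to an unchanged classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: `make_defended`, `toy_classifier`, and `toy_reconstruct` are hypothetical names, and the reconstruction step here is a trivial placeholder standing in for a diffusion-based reconstruction. The sketch only shows how a defense can wrap a classifier without touching it.

```python
from typing import Callable, List

Image = List[float]  # a flattened image with pixel values in [0, 1]


def make_defended(
    classifier: Callable[[Image], int],
    reconstruct: Callable[[Image], Image],
) -> Callable[[Image], int]:
    """Wrap an unmodified classifier so every input is reconstructed first."""

    def defended(x: Image) -> int:
        return classifier(reconstruct(x))

    return defended


def toy_classifier(x: Image) -> int:
    # Hypothetical stand-in classifier: class 1 if mean intensity exceeds 0.5.
    return int(sum(x) / len(x) > 0.5)


def toy_reconstruct(x: Image) -> Image:
    # Placeholder for a diffusion-based reconstruction (assumption):
    # snap each pixel to the nearest of {0.0, 1.0}.
    return [float(round(p)) for p in x]


defended = make_defended(toy_classifier, toy_reconstruct)

clean = [0.6, 0.6, 0.4, 0.6]
perturbed = [0.6, 0.6, 0.1, 0.6]  # one pixel shifted to flip the prediction

print(toy_classifier(clean), toy_classifier(perturbed))  # undefended predictions
print(defended(clean), defended(perturbed))              # defended predictions
```

Because the wrapper only composes two callables, the classifier's weights and code stay untouched. Under this assumed structure, swapping in a real reconstruction model would mean replacing `toy_reconstruct` alone.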
Taxonomy
Methods: Diffusion
