DiffDefense: Defending against Adversarial Attacks via Diffusion Models

Hondamunige Prasanna Silva, Lorenzo Seidenari, and Alberto Del Bimbo

arXiv:2309.03702·cs.LG·September 8, 2023

TL;DR

This paper introduces DiffDefense, a diffusion-model-based reconstruction technique that defends classifiers against adversarial attacks without modifying them, while preserving clean accuracy and speed.

Contribution

It proposes a novel diffusion model approach for adversarial defense that is fast, effective, and compatible with existing classifiers without modifications.

Findings

01 Robustness against adversarial attacks demonstrated

02 Maintains classifier accuracy on clean data

03 Compatible with existing models without changes

Abstract

This paper presents a novel reconstruction method that leverages Diffusion Models to protect machine learning classifiers against adversarial attacks, all without requiring any modifications to the classifiers themselves. The susceptibility of machine learning models to minor input perturbations renders them vulnerable to adversarial attacks. While diffusion-based methods are typically disregarded for adversarial defense due to their slow reverse process, this paper demonstrates that our proposed method offers robustness against adversarial threats while preserving clean accuracy, speed, and plug-and-play compatibility. Code at: https://github.com/HondamunigePrasannaSilva/DiffDefence.
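
The plug-and-play property described above can be illustrated with a minimal sketch: the input is reconstructed first, and the unmodified classifier then sees only the reconstruction. Everything named here is hypothetical and is not taken from the paper or its repository. In particular, `toy_reconstruct` is a trivial thresholding stand-in for the diffusion-based reconstruction, and `toy_classifier` is a stand-in for a real frozen model.

```python
from typing import Callable, List

Image = List[float]  # toy stand-in for an image tensor (assumption)


def with_reconstruction(
    classifier: Callable[[Image], int],
    reconstruct: Callable[[Image], Image],
) -> Callable[[Image], int]:
    """Wrap a frozen classifier so every input is reconstructed first.

    The classifier itself is never modified or retrained.
    """
    def defended(x: Image) -> int:
        return classifier(reconstruct(x))
    return defended


def toy_classifier(x: Image) -> int:
    # Hypothetical classifier: class 1 if the mean pixel value exceeds 0.5.
    return int(sum(x) / len(x) > 0.5)


def toy_reconstruct(x: Image) -> Image:
    # Hypothetical stand-in for the diffusion-based reconstruction:
    # snap each pixel to 0.0 or 1.0. A real defense would use a
    # diffusion model here instead.
    return [1.0 if v >= 0.5 else 0.0 for v in x]


clean = [0.6, 0.6, 0.6, 0.3]
adversarial = [0.55, 0.55, 0.55, 0.25]  # each pixel shifted by 0.05

defended = with_reconstruction(toy_classifier, toy_reconstruct)

print(toy_classifier(clean), toy_classifier(adversarial))  # undefended
print(defended(clean), defended(adversarial))              # defended
```

In this toy setup the small perturbation flips the undefended prediction, while the wrapped classifier returns the same label for both inputs. The wrapper only composes two callables, which is why no change to the classifier is needed.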

Code & Models

Repositories

hondamunigeprasannasilva/diffdefence (PyTorch, official)

Taxonomy

Methods: Diffusion