Adversarial Purification through Representation Disentanglement

Tao Bai; Jun Zhao; Lanqing Guo; Bihan Wen

arXiv:2110.07801 · cs.CV · October 18, 2021 · 1 citation

TL;DR

This paper introduces a novel adversarial purification method that disentangles natural images from adversarial perturbations, significantly improving robustness against unseen attacks without compromising clean accuracy.

Contribution

It proposes a new disentanglement-based purification scheme that enhances defense generalizability and effectiveness against strong, unseen adversarial attacks.

Findings

01

Reduces attack success rate from 61.7% to 14.9%.

02

Restores perturbed images perfectly.

03

Maintains clean accuracy of models.

Abstract

Deep learning models are vulnerable to adversarial examples and make incomprehensible mistakes, which threatens their real-world deployment. Combined with the idea of adversarial training, preprocessing-based defenses are popular and convenient to use because of their task independence and good generalizability. Current defense methods, especially purification, tend to remove "noise" by learning and recovering the natural images. However, unlike random noise, adversarial patterns are much more easily overfitted during model training because they are strongly correlated with the images. In this work, we propose a novel adversarial purification scheme that disentangles natural images from adversarial perturbations and serves as a preprocessing defense. With extensive experiments, our defense is shown to be generalizable and to provide significant protection against unseen…
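To make the idea of purification as a preprocessing defense concrete, the sketch below shows only the shape of the pipeline: an input is split into an estimated natural image and an estimated perturbation, and only the natural part is passed to the classifier. Everything here is an assumption for illustration. `toy_purifier`, `toy_classifier`, and `defended_predict` are hypothetical names, and the 3x3 mean filter is a simple stand-in for the learned disentanglement model, whose architecture the abstract does not describe.

```python
import numpy as np


def toy_purifier(x):
    """Hypothetical stand-in for a learned disentanglement model.

    Splits a 2-D float image into (natural, perturbation) such that
    natural + perturbation == x. A 3x3 mean filter plays the role of
    the natural-image estimate; this is NOT the paper's method.
    """
    h, w = x.shape
    padded = np.pad(x, 1, mode="edge")  # replicate border pixels
    natural = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            natural += padded[dy:dy + h, dx:dx + w]
    natural /= 9.0
    perturbation = x - natural  # whatever the natural estimate does not explain
    return natural, perturbation


def toy_classifier(x):
    """Hypothetical backbone: labels an image by its mean brightness."""
    return int(x.mean() > 0.5)


def defended_predict(classifier, x):
    """Preprocessing defense: purify first, then classify the natural part."""
    natural, _ = toy_purifier(x)
    return classifier(natural)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.full((8, 8), 0.6)
    # Sign-pattern noise used as a crude proxy for an adversarial perturbation.
    perturbed = clean + 0.05 * np.sign(rng.standard_normal(clean.shape))
    print(defended_predict(toy_classifier, perturbed))
```

Because the purifier sits in front of the classifier and does not depend on the downstream task, the backbone model is left unchanged, which is the task independence the abstract attributes to preprocessing-based defenses.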

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications