Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser

Fangzhou Liao; Ming Liang; Yinpeng Dong; Tianyu Pang; Xiaolin Hu; Jun Zhu

arXiv:1712.02976 · cs.CV · May 9, 2018 · 22 cites

TL;DR

This paper introduces a high-level representation guided denoiser (HGD), a defense for image classifiers against adversarial examples. HGD is trained with a loss defined on the target model's outputs for the clean and the denoised image, which counters the error amplification effect seen with standard denoisers, and it generalizes to unseen images and classes.

Contribution

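The paper proposes HGD, a denoiser whose loss is the difference between the target model's outputs for the clean image and for the denoised image. This overcomes the error amplification effect of standard denoisers, and the abstract reports advantages over ensemble adversarial training, the prior state-of-the-art defense on large images.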

Findings

01

HGD improves robustness to white-box and black-box attacks.

02

HGD generalizes well to unseen images and classes.

03

HGD won first place in the NIPS 2017 competition on defense against adversarial attacks.

Abstract

Neural networks are vulnerable to adversarial examples, which poses a threat to their application in security-sensitive systems. We propose the high-level representation guided denoiser (HGD) as a defense for image classification. A standard denoiser suffers from the error amplification effect, in which small residual adversarial noise is progressively amplified and leads to wrong classifications. HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs activated by the clean image and the denoised image. Compared with ensemble adversarial training, which is the state-of-the-art defending method on large images, HGD has three advantages. First, with HGD as a defense, the target model is more robust to either white-box or black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other…
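The loss described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: `target_features` is a hypothetical stand-in for the target model (one random linear layer followed by tanh), the images are short lists of floats, and the use of a mean absolute difference as the "difference" is an assumption of this sketch. It only shows the idea of comparing the target model's outputs for the clean and denoised image, alongside a plain pixel-level loss for contrast.

```python
import math
import random


def target_features(image, weights):
    """Hypothetical stand-in for the target model's high-level output.

    A single linear layer followed by tanh; a real target model would be
    a deep image classifier.
    """
    return [math.tanh(sum(w * p for w, p in zip(row, image))) for row in weights]


def pixel_loss(clean, denoised):
    """Mean absolute difference measured directly on pixel values."""
    return sum(abs(c - d) for c, d in zip(clean, denoised)) / len(clean)


def hgd_loss(clean, denoised, weights):
    """Mean absolute difference between the target model's outputs for the
    clean image and the denoised image (the norm is an assumption here)."""
    f_clean = target_features(clean, weights)
    f_denoised = target_features(denoised, weights)
    return sum(abs(a - b) for a, b in zip(f_clean, f_denoised)) / len(f_clean)


random.seed(0)
# Toy "model" with 4 output units over a 16-pixel image (illustrative sizes).
weights = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
clean = [random.random() for _ in range(16)]
# Pretend a denoiser left a small residual perturbation on every pixel.
denoised = [p + random.uniform(-0.05, 0.05) for p in clean]

print("pixel-level loss:  ", pixel_loss(clean, denoised))
print("output-level loss: ", hgd_loss(clean, denoised, weights))
```

In a training loop, the denoiser's parameters would be updated to reduce the output-level loss while the target model stays fixed; that step is omitted here because it depends on the framework and architecture used.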

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · Anomaly Detection Techniques and Applications