Salient Feature Extractor for Adversarial Defense on Deep Neural Networks

Jinyin Chen, Ruoxi Chen, Haibin Zheng, Zhaoyan Ming, Wenrong Jiang, and Chen Cui

arXiv:2105.06807·cs.CV·May 17, 2021

TL;DR

This paper introduces the salient feature extractor (SFE), which uses coupled GANs to separate class-related (salient) features from misleading (trivial) ones and uses that separation to detect and defend against adversarial attacks on deep neural networks. The authors report state-of-the-art results.

Contribution

The paper proposes a new SFE method that leverages coupled GANs to extract and compare salient and trivial features for adversarial detection and defense, providing interpretability and improved performance.

Findings

01. SFE outperforms baseline methods on MNIST, CIFAR-10, and ImageNet datasets.

02. The method effectively detects adversarial examples by comparing salient and trivial features.

03. SFE offers an interpretable approach to understanding adversarial defense mechanisms.

Abstract

Recent years have witnessed unprecedented success achieved by deep learning models in the field of computer vision. However, their vulnerability to carefully crafted adversarial examples has also attracted increasing attention from researchers. Motivated by the observation that adversarial examples arise from the non-robust features that models learn from the original dataset, we propose the concepts of salient feature (SF) and trivial feature (TF). The former represents the class-related feature, while the latter is usually adopted to mislead the model. We extract these two features with a coupled generative adversarial network model and put forward a novel detection and defense method named salient feature extractor (SFE) to defend against adversarial attacks. Concretely, detection is realized by separating the SF and TF of the input and comparing the difference between them. At the same…
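
The detection idea in the abstract (separate the SF and TF of an input, then compare them) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the authors' implementation: `toy_extract_sf`, `toy_extract_tf`, and `toy_classify` are simple placeholders for the coupled-GAN generators and the target model, and the comparison rule (flag the input when the classifier assigns different labels to its SF and TF) is an assumption about how "comparing the difference" might be realized.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def argmax(values: Vector) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def detect_adversarial(
    x: Vector,
    extract_sf: Callable[[Vector], Vector],
    extract_tf: Callable[[Vector], Vector],
    classify: Callable[[Vector], Vector],
) -> Tuple[bool, int, int]:
    """Flag x when the labels predicted from its SF and TF disagree.

    Assumed rule, not taken from the paper: disagreement between the two
    predicted labels is treated as evidence of an adversarial input.
    """
    label_sf = argmax(classify(extract_sf(x)))
    label_tf = argmax(classify(extract_tf(x)))
    return label_sf != label_tf, label_sf, label_tf


# --- Toy placeholders (hypothetical; the paper uses coupled GANs) ---

def toy_extract_sf(x: Vector) -> Vector:
    # Pretend the first two coordinates carry the class-related signal.
    return [x[0], x[1], 0.0, 0.0]


def toy_extract_tf(x: Vector) -> Vector:
    # Pretend the last two coordinates carry the misleading signal.
    return [0.0, 0.0, x[2], x[3]]


def toy_classify(x: Vector) -> Vector:
    # Two-class linear scorer standing in for the target network.
    weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


benign = [2.0, 0.0, 2.0, 0.0]       # both feature groups point to class 0
adversarial = [2.0, 0.0, 0.0, 3.0]  # trivial part pushed toward class 1

if __name__ == "__main__":
    for name, sample in [("benign", benign), ("adversarial", adversarial)]:
        print(name, detect_adversarial(sample, toy_extract_sf, toy_extract_tf, toy_classify))
```

In this toy setup the benign vector gets the same label from both feature groups, while the perturbed vector does not, so only the latter is flagged. The official repository listed under Code & Models is the place to look for the actual extractor and decision rule.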

Code & Models

Repositories

haibinzheng/SFE
tf · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning