Salient Feature Extractor for Adversarial Defense on Deep Neural Networks
Jinyin Chen, Ruoxi Chen, Haibin Zheng, Zhaoyan Ming, Wenrong Jiang, and Chen Cui

TL;DR
This paper introduces a novel salient feature extractor (SFE) that uses coupled GANs to detect and defend against adversarial attacks in deep neural networks by distinguishing class-related features from misleading ones, achieving state-of-the-art results.
Contribution
The paper proposes a new SFE method that leverages coupled GANs to extract and compare salient and trivial features for adversarial detection and defense, providing interpretability and improved performance.
Findings
SFE outperforms baseline methods on MNIST, CIFAR-10, and ImageNet datasets.
The method effectively detects adversarial examples by comparing salient and trivial features.
SFE offers an interpretable approach to understanding adversarial defense mechanisms.
Abstract
Recent years have witnessed unprecedented success achieved by deep learning models in the field of computer vision. However, their vulnerability to carefully crafted adversarial examples has also attracted increasing attention from researchers. Motivated by the observation that adversarial examples arise from the non-robust features that models learn from the original dataset, we propose the concepts of salient feature (SF) and trivial feature (TF). The former represents the class-related feature, while the latter is usually adopted to mislead the model. We extract these two features with a coupled generative adversarial network model and put forward a novel detection and defense method named salient feature extractor (SFE) to defend against adversarial attacks. Concretely, detection is realized by separating the SF and TF of the input and comparing the difference between them. At the same…
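The detection idea in the abstract (separate the SF and TF of an input, then compare them) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how the two features are compared, so the rule used here (flag the input when the classifier assigns different labels to its SF and TF) is only one plausible reading, and the names `detect_adversarial`, `toy_extract_sf`, `toy_extract_tf`, and `toy_classify` are hypothetical stand-ins. In the paper the extractors are a coupled GAN model; here they are hand-written toy functions.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def detect_adversarial(
    x: Vector,
    extract_sf: Callable[[Vector], Vector],
    extract_tf: Callable[[Vector], Vector],
    classify: Callable[[Vector], Sequence[float]],
) -> Tuple[bool, int, int]:
    """Flag x when its salient and trivial features are classified differently.

    ASSUMPTION: label disagreement is used as the comparison rule; the
    abstract only states that SF and TF are separated and compared.
    """
    sf_label = argmax(classify(extract_sf(x)))
    tf_label = argmax(classify(extract_tf(x)))
    return sf_label != tf_label, sf_label, tf_label


# Toy stand-ins (hypothetical): the first half of the vector plays the role
# of the salient feature, the second half the trivial feature.
def toy_extract_sf(x: Vector) -> List[float]:
    return [x[0], x[1], 0.0, 0.0]


def toy_extract_tf(x: Vector) -> List[float]:
    return [0.0, 0.0, x[2], x[3]]


def toy_classify(v: Vector) -> List[float]:
    # Two-class scores: each class sums one salient and one trivial slot.
    return [v[0] + v[2], v[1] + v[3]]


benign = [0.9, 0.1, 0.6, 0.2]       # SF and TF both point to class 0
adversarial = [0.9, 0.1, 0.1, 0.8]  # TF has been pushed toward class 1

benign_result = detect_adversarial(
    benign, toy_extract_sf, toy_extract_tf, toy_classify
)
adversarial_result = detect_adversarial(
    adversarial, toy_extract_sf, toy_extract_tf, toy_classify
)
```

With these toy functions the benign input is not flagged, because both features agree on class 0, while the perturbed input is flagged, because its trivial part votes for class 1. A real detector would replace the three toy callables with trained models and might compare the features with a different criterion.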
Taxonomy
Topics: Adversarial Robustness in Machine Learning
