Detection Defense Against Adversarial Attacks with Saliency Map
Dengpan Ye, Chuanxi Chen, Changrui Liu, Hao Wang, Shunzhi Jiang

TL;DR
This paper introduces a saliency map-based detection method that effectively identifies adversarial examples in neural networks, enhancing interpretability and security without retraining or modifying models.
Contribution
A novel detection approach combining saliency maps and noise to identify adversarial attacks, outperforming existing methods in generality and effectiveness.
Findings
Detects all tested adversarial attacks with a high success rate
Effective on datasets such as ImageNet and on popular models
More general than existing state-of-the-art techniques
Abstract
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision yet can cause deep models to misbehave. This phenomenon may lead to severe consequences in safety- and security-critical applications. Existing defenses tend to harden models against adversarial attacks, e.g., through adversarial training. However, these are usually impractical to implement due to the high cost of retraining and the cumbersome process of altering the model architecture or parameters. In this paper, we discuss the saliency map method from the perspective of enhancing model interpretability; it is similar to introducing an attention mechanism into the model, so as to understand the process of object identification by deep networks. We then propose a novel method combined with additional…
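As a rough illustration of what a saliency map is, the sketch below estimates how strongly each input feature influences a class score, using central finite differences on a toy linear softmax classifier. Everything here is an assumption made for illustration: the names `class_score` and `saliency_map`, the weights, and the four-value input are hypothetical, and the abstract above is truncated, so this is not the authors' detection method and does not reproduce how they combine saliency maps with noise.

```python
import math

def class_score(x, weights, bias, target):
    """Softmax probability of class `target` for a toy linear classifier."""
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return exps[target] / sum(exps)

def saliency_map(score_fn, x, eps=1e-4):
    """Central-difference estimate of |d score / d x_i| for each input feature."""
    saliency = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        saliency.append(abs(score_fn(up) - score_fn(down)) / (2 * eps))
    return saliency

# Hypothetical 2-class model over a 4-"pixel" input (illustrative values only).
WEIGHTS = [[2.0, 0.0, -1.0, 0.0],
           [-2.0, 0.0, 1.0, 0.0]]
BIAS = [0.0, 0.0]
x = [0.5, 0.9, 0.1, 0.3]

# Saliency of each input value with respect to the class-0 probability.
smap = saliency_map(lambda v: class_score(v, WEIGHTS, BIAS, target=0), x)
```

In this toy setup, the features with zero weight receive zero saliency, while the feature with the largest weight magnitude receives the highest value. In practice, saliency maps for deep networks are usually computed with automatic differentiation rather than finite differences.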
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
