TL;DR
This paper introduces feature squeezing, a simple and efficient method to detect adversarial examples in deep neural networks by reducing input complexity and comparing model predictions.
Contribution
The paper proposes feature squeezing as a novel, computationally inexpensive defense mechanism for identifying adversarial examples in DNNs.
Findings
High detection accuracy for adversarial examples
Effective when combined with other defense strategies
Works with simple input transformations like color depth reduction and smoothing
Abstract
Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by \emph{adversarial examples} that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, \emph{feature squeezing}, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model's prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper…
Click any figure to enlarge with its caption.
Figure 1Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
See pages - of NDSS18_Feature_Squeezing.pdf
