Stochastic Activation Pruning for Robust Adversarial Defense

Guneet S. Dhillon; Kamyar Azizzadenesheli; Zachary C. Lipton; Jeremy Bernstein; Jean Kossaifi; Aran Khanna; Anima Anandkumar

arXiv:1803.01442 · cs.LG · March 6, 2018 · 207 cites

Open Access · 1 Repo

TL;DR

This paper introduces Stochastic Activation Pruning (SAP), a defense that makes neural networks more robust to adversarial attacks by randomly pruning activations, preferentially those with smaller magnitude, and scaling up the survivors. The method is inspired by game theory and applies to pretrained models without fine-tuning.

Contribution

Stochastic Activation Pruning (SAP) improves the adversarial robustness of pretrained neural networks without additional training.

Findings

01. SAP increases adversarial robustness across multiple models.

02. SAP maintains model calibration under attack.

03. SAP improves accuracy against adversarial examples.

Abstract

Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness…
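The pruning step described in the abstract can be illustrated with a short sketch. The abstract does not give the exact sampling rule, so the code below makes labeled assumptions: each activation is drawn with probability proportional to its absolute value, a fixed number of draws is made with replacement, anything never drawn is pruned, and each survivor is divided by its probability of surviving so that the layer output is unchanged in expectation. The names `sap_layer` and `num_draws` are hypothetical and are not taken from the official repository.

```python
import random


def sap_layer(activations, num_draws, rng):
    """Apply one stochastic pruning pass to a flat list of activations.

    Assumption (not stated in the abstract): activation j is drawn with
    probability p_j = |a_j| / sum_k |a_k|, `num_draws` draws are made with
    replacement, and any activation never drawn is pruned (set to zero).
    """
    total = sum(abs(a) for a in activations)
    if total == 0.0:
        return list(activations)  # all zeros: nothing to sample from

    probs = [abs(a) / total for a in activations]

    # Sample indices with replacement; large-magnitude units are drawn more often,
    # so small-magnitude units are the ones most likely to be pruned.
    drawn = set(rng.choices(range(len(activations)), weights=probs, k=num_draws))

    pruned = []
    for j, a in enumerate(activations):
        if j in drawn:
            # Probability that unit j is picked in at least one of the draws.
            keep_prob = 1.0 - (1.0 - probs[j]) ** num_draws
            pruned.append(a / keep_prob)  # scale up the survivor to compensate
        else:
            pruned.append(0.0)
    return pruned


# Usage: a fresh random mask is drawn on every call (the "mixed strategy").
rng = random.Random(0)
hidden = [0.05, -2.0, 0.3, 1.2, -0.01, 0.0]
out = sap_layer(hidden, num_draws=4, rng=rng)
print(out)
```

Because the mask is resampled on every forward pass, an adversary cannot rely on a fixed set of active units. Under the assumptions above, averaging many pruned outputs recovers the original activations, which is the sense in which scaling up the survivors compensates for the pruning.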

Code & Models

Repositories

Guneet-Dhillon/Stochastic-Activation-Pruning (Official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)

Methods: Pruning