Biologically inspired protection of deep networks from adversarial attacks
Aran Nayebi, Surya Ganguli

TL;DR
This paper introduces a biologically inspired training scheme for deep neural networks that improves robustness to gradient-based adversarial attacks, reaching state-of-the-art results on MNIST without any adversarial training.
Contribution
The authors propose a biologically inspired method for training neural networks that improves their robustness to adversarial examples, including those generated by targeted iterative and second-order attacks.
Findings
Networks achieve state-of-the-art adversarial robustness on MNIST.
Robust networks develop flat, compressed internal representations.
Kurtotic weight distributions contribute to adversarial defense.
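To make the "kurtotic" finding concrete, the sketch below computes excess kurtosis (the fourth standardized moment minus 3) for two synthetic weight samples. This summary does not say how the authors measured kurtosis, so the estimator, the function name `excess_kurtosis`, and the synthetic samples are all assumptions for illustration. A Gaussian sample should score near 0, and a heavier-tailed Laplace sample should score clearly higher.

```python
import random


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


# Synthetic stand-ins for a layer's weights (hypothetical, not from the paper).
rng = random.Random(0)
gaussian_weights = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# The difference of two unit exponentials is Laplace distributed (heavy tails).
laplace_weights = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20000)]

print("gaussian:", excess_kurtosis(gaussian_weights))
print("laplace: ", excess_kurtosis(laplace_weights))
```

In practice the same function could be applied to the flattened weights of each layer of a trained network to compare a robust model against a baseline.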
Abstract
Inspired by biophysical principles underlying nonlinear dendritic computation in neural circuits, we develop a scheme to train deep neural networks to make them robust to adversarial attacks. Our scheme generates highly nonlinear, saturated neural networks that achieve state-of-the-art performance on gradient-based adversarial examples on MNIST, despite never being exposed to adversarially chosen examples during training. Moreover, these networks exhibit unprecedented robustness to targeted, iterative schemes for generating adversarial examples, including second-order methods. We further identify principles governing how these networks achieve their robustness, drawing on methods from information geometry. We find these networks progressively create highly flat and compressed internal representations that are sensitive to very few input dimensions, while still solving the task.…
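As a rough illustration of what a gradient-based adversarial example is, the sketch below applies a single gradient-sign step to a tiny logistic model. It is a generic textbook-style attack on a hypothetical three-input classifier, not the authors' networks or their exact attack setup. The weights, input, and step size `eps` are made-up values chosen only to show the mechanism.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def gradient_sign_attack(w, b, x, y, eps):
    """Move each input by eps in the direction that increases the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    signs = [(g > 0) - (g < 0) for g in grad]
    # Clip to [0, 1], mimicking a valid pixel range.
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, signs)]


# Hypothetical model and input (assumptions for illustration only).
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.8, 0.2, 0.5], 1
eps = 0.3

x_adv = gradient_sign_attack(w, b, x, y, eps)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
print("clean probability of true class:      ", p_clean)
print("adversarial probability of true class:", p_adv)
```

A small, bounded change to every input dimension is enough to push this linear model across its decision boundary. The abstract's claim is that highly saturated networks, which are sensitive to very few input dimensions, leave such perturbations with far less to exploit.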
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Physical Unclonable Functions (PUFs) and Hardware Security
