Sufficient Conditions for Idealised Models to Have No Adversarial Examples: a Theoretical and Empirical Study with Bayesian Neural Networks

Yarin Gal; Lewis Smith

arXiv:1806.00667·stat.ML·July 2, 2018·27 cites

TL;DR

This paper establishes theoretical conditions under which idealised models, including Bayesian neural networks, have no adversarial examples, and empirically demonstrates these ideas using HMC inference on synthetic and real image data.

Contribution

The paper proves two sufficient conditions under which idealised models have no adversarial examples, shows that idealised Bayesian neural networks satisfy them, and demonstrates the ideas empirically with near-idealised BNNs using HMC inference.

Findings

01. Under two sufficient conditions, idealised models provably have no adversarial examples.

02. Near-idealised BNNs with HMC inference show high epistemic uncertainty on off-manifold images, which is where adversarial images lie in the paper's setting.

03. Non-idealised BNNs relying on dropout have failure cases, which suggest a new attack on dropout models.

Abstract

We prove, under two sufficient conditions, that idealised models can have no adversarial examples. We discuss which idealised models satisfy our conditions, and show that idealised Bayesian neural networks (BNNs) satisfy these. We continue by studying near-idealised BNNs using HMC inference, demonstrating the theoretical ideas in practice. We experiment with HMC on synthetic data derived from MNIST for which we know the ground-truth image density, showing that near-perfect epistemic uncertainty correlates to density under image manifold, and that adversarial images lie off the manifold in our setting. This suggests why MC dropout, which can be seen as performing approximate inference, has been observed to be an effective defence against adversarial examples in practice. We highlight failure-cases of non-idealised BNNs relying on dropout, suggesting a new attack for dropout models and a…
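To make the notion of epistemic uncertainty under MC dropout more concrete, the following is a minimal NumPy sketch, not the authors' code. It keeps dropout active at test time, averages several stochastic forward passes, and scores each input by the mutual information between predictions and model parameters (predictive entropy minus expected entropy). The use of mutual information as the uncertainty measure, the tiny two-layer network, its random untrained weights, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mc_dropout_probs(x, W1, W2, n_samples=100, p_drop=0.5, rng=None):
    """Stochastic forward passes with dropout left on at test time.

    Hypothetical two-layer ReLU network; returns an array of shape
    (n_samples, n_inputs, n_classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1, 0.0)              # hidden layer
        mask = rng.random(h.shape) >= p_drop     # fresh dropout mask per pass
        h = h * mask / (1.0 - p_drop)            # inverted-dropout scaling
        probs.append(softmax(h @ W2))
    return np.stack(probs)

def entropy(p, axis=-1, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=axis)

def mutual_information(probs):
    """Entropy of the mean prediction minus the mean per-sample entropy."""
    mean_p = probs.mean(axis=0)
    return entropy(mean_p) - entropy(probs).mean(axis=0)

# Illustrative run with random (untrained) weights and random inputs.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 32))
W2 = rng.normal(size=(32, 3))
x = rng.normal(size=(5, 4))

probs = mc_dropout_probs(x, W1, W2, n_samples=200, rng=rng)
mi = mutual_information(probs)   # one uncertainty score per input
```

With a trained model, inputs whose score is high are those on which the sampled networks disagree. A detector built on this idea would threshold the score, with the threshold chosen on held-out data.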

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Anomaly Detection Techniques and Applications

Methods: Dropout