Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders

Partha Ghosh; Arpan Losalka; Michael J Black

arXiv:1806.00081·cs.LG·December 11, 2018

TL;DR

This paper introduces a Gaussian mixture variational autoencoder framework that unifies the detection and rejection of adversarial and fooling samples, improving robustness of classifiers against attacks.

Contribution

It presents a novel VAE-based model with a Gaussian mixture prior that handles both adversarial and fooling samples within a single unified framework.

Findings

01

The model rejects adversarial samples by performing selective classification.

02

It enables semi-supervised learning of selective classifiers.

03

Reclassification of rejected samples is possible.

Abstract

The susceptibility of deep neural networks to adversarial attacks poses a major theoretical and practical challenge. All efforts to harden classifiers against such attacks have seen limited success. Two distinct categories of samples to which deep networks are vulnerable, "adversarial samples" and "fooling samples", have so far been tackled separately because of the difficulty of considering them together. In this work, we show how both can be addressed under one unified framework. We tie a discriminative model to a generative model, so that the adversarial objective entails a conflict. Our model has the form of a variational autoencoder with a Gaussian mixture prior on the latent vector. Each mixture component of the prior distribution corresponds to one of the classes in the data. This enables us to perform selective classification, leading to the rejection of adversarial samples…
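
The idea of selective classification over a class-wise mixture prior can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the rejection rule, so the Euclidean distance threshold, the function name `selective_classify`, the `REJECT` sentinel, and the toy class means below are all assumptions made for illustration. The sketch assigns a latent code to the nearest mixture component and rejects it when it lies too far from every component mean.

```python
import math

REJECT = -1  # hypothetical sentinel meaning "no class assigned"


def selective_classify(z, class_means, threshold):
    """Assign latent code z to the nearest class mean, or reject it.

    Assumption: one mixture component per class, summarized by its mean,
    and a single distance threshold shared by all classes.
    """
    # Euclidean distance from the latent code to each component mean
    distances = [math.dist(z, mu) for mu in class_means]
    nearest = min(range(len(distances)), key=distances.__getitem__)
    # Reject when the code is far from every component of the prior
    return nearest if distances[nearest] <= threshold else REJECT


# Toy 2-D latent space with two classes (illustrative values only)
class_means = [(0.0, 0.0), (5.0, 5.0)]
near_class_0 = selective_classify((0.2, -0.1), class_means, threshold=1.0)
between_classes = selective_classify((2.5, 2.5), class_means, threshold=1.0)
```

Here `near_class_0` is assigned to class 0, while `between_classes` is rejected because it sits far from both component means. A full model would obtain `z` from a trained encoder and could use further rejection criteria; those details are outside what the abstract states.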