Adversarial defenses via a mixture of generators
Maciej Żelaszczyk, Jacek Mańdziuk

TL;DR
This paper introduces a novel adversarial defense method using a mixture of generators trained adversarially to recover correct classifications of images affected by various attacks, without supervision.
Contribution
To the authors' knowledge, this is the first use of a mixture-based adversarially trained system as a defense. The system is capable of handling multiple unseen attacks simultaneously without supervision.
Findings
The system effectively recovers class information on unseen adversarial examples.
The method is competitive with existing defenses in single-attack scenarios.
It works on the MNIST dataset without requiring attack labels or data labels.
Abstract
In spite of the enormous success of neural networks, adversarial examples remain a relatively weakly understood feature of deep learning systems. There is a considerable effort in both building more powerful adversarial attacks and designing methods to counter the effects of adversarial examples. We propose a method to transform the adversarial input data through a mixture of generators in order to recover the correct class obfuscated by the adversarial attack. A canonical set of images is used to generate adversarial examples through potentially multiple attacks. Such transformed images are processed by a set of generators, which are trained adversarially as a whole to compete in inverting the initial transformations. To our knowledge, this is the first use of a mixture-based adversarially trained system as a defense mechanism. We show that it is possible to train such a system without…
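The competition among generators described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the attacks are constant pixel shifts, each generator is a fixed constant correction rather than a trained network, the discriminator is a hand-written score based on mean intensity, and the winner-take-all selection rule is only one plausible way for generators to "compete". All function and variable names are hypothetical.

```python
# Toy sketch: several "generators" compete to invert an unknown "attack".
# No attack labels are used; the score only looks at canonical-data statistics.

def make_attack(shift):
    """Hypothetical attack: add a constant perturbation to every pixel."""
    def attack(image):
        return [p + shift for p in image]
    return attack


def make_generator(correction):
    """Hypothetical generator: a fixed correction (a real one would be trained)."""
    def generator(image):
        return [p + correction for p in image]
    return generator


def discriminator_score(candidate, canonical_mean):
    """Toy discriminator: higher when the candidate resembles canonical data,
    measured here only by closeness of mean pixel intensity."""
    mean = sum(candidate) / len(candidate)
    return -abs(mean - canonical_mean)


def select_winner(adv_image, generators, canonical_mean):
    """Run every generator and keep the output the discriminator rates highest.
    In a training loop, only this winning generator would be updated (assumed)."""
    outputs = [g(adv_image) for g in generators]
    scores = [discriminator_score(o, canonical_mean) for o in outputs]
    winner = max(range(len(generators)), key=scores.__getitem__)
    return winner, outputs[winner], scores


canonical = [0.2, 0.4, 0.6]                       # stand-in for a clean image
canonical_mean = sum(canonical) / len(canonical)

attacks = [make_attack(0.1), make_attack(-0.1)]   # two different toy attacks
generators = [make_generator(-0.1), make_generator(0.1)]

results = []
for attack in attacks:
    winner, recovered, _ = select_winner(attack(canonical), generators, canonical_mean)
    results.append((winner, recovered))
    print("winning generator:", winner)
```

In this sketch each generator ends up winning on the attack it happens to invert, which mirrors the idea of generators specializing on different transformations. A real system would replace the fixed corrections and the hand-written score with trained networks.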
