Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks
Dario Serez, Marco Cristani, Alessio Del Bue, Vittorio Murino, Pietro Morerio

TL;DR
Pre-trained Multiple Latent Variable Generative Models (MLVGMs) can effectively purify adversarial noise from images without additional training, offering a training-free defense mechanism that leverages disentangled features.
Contribution
This paper introduces the use of pre-trained MLVGMs for adversarial purification, demonstrating their effectiveness as a training-free defense against adversarial attacks.
Findings
Smaller MLVGMs, rather than large models trained on billions of samples, are competitive with traditional methods.
Pre-trained MLVGMs generalize well to adversarial purification.
The approach is training-free and leverages disentangled features.
Abstract
Attackers can deliberately perturb classifiers' input with subtle noise, altering final predictions. Among proposed countermeasures, adversarial purification employs generative networks to preprocess input images, filtering out adversarial noise. In this study, we propose specific generators, defined Multiple Latent Variable Generative Models (MLVGMs), for adversarial purification. These models possess multiple latent variables that naturally disentangle coarse from fine features. Taking advantage of these properties, we autoencode images to maintain class-relevant information, while discarding and re-sampling any detail, including adversarial noise. The procedure is completely training-free, exploring the generalization abilities of pre-trained MLVGMs on the adversarial purification downstream task. Despite the lack of large models, trained on billions of samples, we show that smaller…
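The purification idea described in the abstract (encode the image, keep the coarse latents, re-sample the fine ones, decode) can be illustrated with a toy sketch. This is not the authors' implementation: the linear "generator", the function names (make_toy_generator, encode, decode, purify), and the fine_scale parameter are all hypothetical stand-ins, and the toy assumes the perturbation lies entirely in the fine-detail directions, which a real attack need not satisfy.

```python
import numpy as np

def make_toy_generator(dim, n_coarse, n_fine, seed=0):
    """Hypothetical linear stand-in for an MLVGM with two latent groups.

    Returns two matrices with mutually orthonormal columns: one maps the
    coarse latent to image space, the other maps the fine latent.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, n_coarse + n_fine)))
    return q[:, :n_coarse], q[:, n_coarse:]

def encode(x, coarse_basis, fine_basis):
    # With orthonormal columns, encoding reduces to a projection.
    return coarse_basis.T @ x, fine_basis.T @ x

def decode(z_coarse, z_fine, coarse_basis, fine_basis):
    return coarse_basis @ z_coarse + fine_basis @ z_fine

def purify(x, coarse_basis, fine_basis, fine_scale=0.1, rng=None):
    """Keep the coarse code of x, discard the fine code, re-sample it, decode."""
    if rng is None:
        rng = np.random.default_rng()
    z_coarse, _ = encode(x, coarse_basis, fine_basis)
    z_fine_new = fine_scale * rng.standard_normal(fine_basis.shape[1])
    return decode(z_coarse, z_fine_new, coarse_basis, fine_basis)

# Usage: a clean sample and a copy perturbed only along fine directions
# (an assumption of this toy) purify to the same output under the same seed.
coarse_basis, fine_basis = make_toy_generator(dim=16, n_coarse=4, n_fine=12)
data_rng = np.random.default_rng(1)
x_clean = decode(data_rng.standard_normal(4), data_rng.standard_normal(12),
                 coarse_basis, fine_basis)
x_adv = x_clean + fine_basis @ (0.05 * data_rng.standard_normal(12))
same = np.allclose(
    purify(x_clean, coarse_basis, fine_basis, rng=np.random.default_rng(7)),
    purify(x_adv, coarse_basis, fine_basis, rng=np.random.default_rng(7)),
)
print(same)
```

In this toy the coarse code plays the role of class-relevant information, so a downstream classifier that depends only on it would see the same input before and after the perturbation. A real MLVGM is nonlinear and has no closed-form encoder, so the encoding step would look quite different in practice.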
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
