Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks

Dario Serez; Marco Cristani; Alessio Del Bue; Vittorio Murino; Pietro Morerio

arXiv:2412.03453·cs.CV·December 5, 2024

TL;DR

Pre-trained Multiple Latent Variable Generative Models (MLVGMs) can effectively purify adversarial noise from images without additional training, offering a training-free defense mechanism that leverages disentangled features.

Contribution

This paper introduces the use of pre-trained MLVGMs for adversarial purification, demonstrating their effectiveness as a training-free defense against adversarial attacks.

Findings

01. Smaller MLVGMs are competitive with traditional methods.

02. Pre-trained MLVGMs generalize well to adversarial purification.

03. The approach is training-free and leverages disentangled features.

Abstract

Attackers can deliberately perturb classifiers' input with subtle noise, altering final predictions. Among proposed countermeasures, adversarial purification employs generative networks to preprocess input images, filtering out adversarial noise. In this study, we propose specific generators, defined Multiple Latent Variable Generative Models (MLVGMs), for adversarial purification. These models possess multiple latent variables that naturally disentangle coarse from fine features. Taking advantage of these properties, we autoencode images to maintain class-relevant information, while discarding and re-sampling any detail, including adversarial noise. The procedure is completely training-free, exploring the generalization abilities of pre-trained MLVGMs on the adversarial purification downstream task. Despite the lack of large models, trained on billions of samples, we show that smaller…
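
The purification idea in the abstract (keep the coarse, class-relevant latent variables and re-sample the fine ones) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper or its repository: the function `purify_latents`, the per-level `keep_weights`, the linear blend between the encoded code and a fresh sample, and the standard-normal prior. The encoder and decoder of a pre-trained MLVGM are left out, so the sketch operates on a hypothetical three-level latent code only.

```python
import random
from typing import Callable, List, Sequence

Latent = List[float]


def purify_latents(
    encoded: Sequence[Latent],
    keep_weights: Sequence[float],
    sample_prior: Callable[[int], Latent],
) -> List[Latent]:
    """Blend each encoded latent level with a fresh prior sample.

    A keep weight of 1.0 leaves a level exactly as encoded (coarse,
    class-relevant content). A keep weight of 0.0 replaces the level
    with a new sample (fine detail, where adversarial noise is assumed
    to live). The linear blend is an assumption of this sketch.
    """
    if len(encoded) != len(keep_weights):
        raise ValueError("need one keep weight per latent level")
    purified = []
    for z, w in zip(encoded, keep_weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("keep weights must lie in [0, 1]")
        fresh = sample_prior(len(z))
        purified.append([w * a + (1.0 - w) * b for a, b in zip(z, fresh)])
    return purified


def gaussian_prior(rng: random.Random) -> Callable[[int], Latent]:
    """Standard-normal prior sampler (an assumption of this sketch)."""
    return lambda dim: [rng.gauss(0.0, 1.0) for _ in range(dim)]


# Hypothetical three-level latent code, ordered from coarse to fine.
encoded = [[0.5, -1.0], [0.2, 0.1, 0.3], [2.0, -2.0, 1.5, 0.7]]
keep_weights = [1.0, 0.5, 0.0]  # keep coarse, blend middle, re-sample fine
purified = purify_latents(encoded, keep_weights, gaussian_prior(random.Random(0)))
```

In a full pipeline the purified latents would be passed to the generator's decoder and the reconstructed image handed to the classifier. How many levels to keep and how strongly to re-sample each one would need to be chosen per model and dataset.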

Code & Models

Repositories

serezd/gen_adversarial (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications