Adversarial robustness of VAEs through the lens of local geometry
Asif Khan, Amos Storkey

TL;DR
This paper investigates the vulnerability of VAEs to adversarial attacks by analyzing the local geometry of the latent space through the pullback metric tensor. It proposes new robustness scores and a mixup training method to enhance robustness.
Contribution
It introduces a geometric perspective using the pullback metric tensor to evaluate and improve VAE robustness against adversarial perturbations.
Findings
Pullback metric tensor eigenspectrum correlates with robustness.
Mixup training improves robustness without degrading reconstruction.
Proposed scores effectively measure VAE robustness.
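As a rough illustration of what mixup means in general, the sketch below builds a convex combination of two input samples with a Beta-distributed weight. This is the generic input-space form of mixup and an assumption on our part; the summary above does not describe the paper's exact variant, and the names `mixup_pair` and `alpha` are hypothetical.

```python
import random


def mixup_pair(x_a, x_b, alpha=0.4, rng=random):
    """Return a convex combination of two samples and the mixing weight.

    Generic input-space mixup (an assumption, not necessarily the
    paper's training scheme): lam ~ Beta(alpha, alpha),
    mixed = lam * x_a + (1 - lam) * x_b.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    return mixed, lam


# Hypothetical usage: mix two toy "images" flattened to lists of floats.
rng = random.Random(0)  # seeded only to make the example repeatable
mixed, lam = mixup_pair([0.0, 1.0, 0.5], [1.0, 3.0, 0.5], rng=rng)
```

In a training loop, the mixed sample would be fed to the VAE in place of (or alongside) the original samples; how the loss is weighted is a design choice this sketch leaves open.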
Abstract
In an unsupervised attack on variational autoencoders (VAEs), an adversary finds a small perturbation in an input sample that significantly changes its latent space encoding, thereby compromising the reconstruction for a fixed decoder. A known reason for such vulnerability is the distortion of the latent space that results from a mismatch between the approximate latent posterior and the prior distribution. Consequently, a slight change in an input sample can move its encoding to a low- or zero-density region of the latent space, resulting in an unconstrained generation. This paper demonstrates that an optimal way for an adversary to attack VAEs is to exploit a directional bias of a stochastic pullback metric tensor induced by the encoder and decoder networks. The pullback metric tensor of an encoder measures the change in infinitesimal latent volume from an input to a latent space. Thus, it can be…
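To make the geometric idea concrete, here is a minimal sketch of an encoder pullback metric, G(x) = J(x)^T J(x), where J is the Jacobian of the encoder at x. It deliberately simplifies the setting: it uses a deterministic toy encoder mean (the hypothetical `encoder_mean`) rather than the stochastic encoder and decoder the abstract refers to, estimates J by finite differences, and finds the dominant eigenvector of G by power iteration. A small input step along that eigenvector moves the latent code furthest, to first order.

```python
import math


def encoder_mean(x):
    """Hypothetical toy encoder mean, R^3 -> R^2 (not a trained VAE)."""
    return [math.tanh(2.0 * x[0] + 0.5 * x[1]), 0.3 * x[1] - 0.1 * x[2]]


def jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x (rows: outputs)."""
    out_dim = len(f(x))
    J = [[0.0] * len(x) for _ in range(out_dim)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(out_dim):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J


def pullback_metric(J):
    """G = J^T J, a symmetric positive semi-definite input-space matrix."""
    n = len(J[0])
    return [[sum(row[a] * row[b] for row in J) for b in range(n)]
            for a in range(n)]


def top_eigenpair(G, iters=500):
    """Dominant eigenpair of G by power iteration.

    Starts from the uniform vector, so it assumes that vector is not
    orthogonal to the dominant eigenvector.
    """
    n = len(G)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(G[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Rayleigh quotient v^T G v gives the eigenvalue estimate.
    lam = sum(v[i] * G[i][k] * v[k] for i in range(n) for k in range(n))
    return lam, v


def latent_shift(f, x, direction, step):
    """Euclidean distance the encoding moves for a step along direction."""
    moved = f([xi + step * di for xi, di in zip(x, direction)])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(moved, f(x))))


# Hypothetical usage: compare the most sensitive direction with an axis.
x0 = [0.0, 0.0, 0.0]
lam, v_top = top_eigenpair(pullback_metric(jacobian(encoder_mean, x0)))
shift_along_top = latent_shift(encoder_mean, x0, v_top, 1e-3)
shift_along_axis = latent_shift(encoder_mean, x0, [0.0, 0.0, 1.0], 1e-3)
```

The spread of the eigenvalues of G is what the Findings refer to as the eigenspectrum: a strongly dominant eigenvalue indicates a direction in input space along which the encoding is disproportionately sensitive.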
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
