Adversarial examples by perturbing high-level features in intermediate decoder layers

Vojtěch Čermák; Lukáš Adam

arXiv:2110.07182·cs.CV·October 15, 2021

Open Access

TL;DR

This paper introduces an adversarial attack that perturbs high-level features in the intermediate decoder layers of a generative model, producing semantically meaningful adversarial images that are less vulnerable to steganographic defenses.

Contribution

The paper generates adversarial examples by perturbing intermediate decoder features, formulates the attack as minimizing the Wasserstein distance between the adversarial and initial images under a misclassification constraint, and evaluates it on the MNIST and ImageNet datasets.

Findings

01 Adversarial images are less vulnerable to steganographic defenses.

02 The method modifies key features like edges and colors.

03 Defense techniques based on adversarial training are vulnerable.

Abstract

We propose a novel method for creating adversarial examples. Instead of perturbing pixels, we use an encoder-decoder representation of the input image and perturb intermediate layers in the decoder. This changes the high-level features provided by the generative model. Therefore, our perturbation possesses semantic meaning, such as a longer beak or green tints. We formulate this task as an optimization problem by minimizing the Wasserstein distance between the adversarial and initial images under a misclassification constraint. We employ the projected gradient method with a simple inexact projection. Due to the projection, all iterations are feasible, and our method always generates adversarial images. We perform numerical experiments on the MNIST and ImageNet datasets in both targeted and untargeted settings. We demonstrate that our adversarial images are much less vulnerable to…
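The loop described in the abstract (perturb intermediate decoder features, reduce a distance to the initial image, and project back so that every iterate stays misclassified) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the decoder tail is a tiny `tanh` layer with random weights, the classifier is linear, the encoder is omitted (the intermediate features `h0` are sampled directly), squared L2 distance stands in for the Wasserstein distance, and the inexact projection is a hypothetical bisection toward the last feasible iterate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model (not from the paper): the decoder tail maps
# intermediate features h to an "image" x, and a linear classifier labels x.
W2 = rng.normal(size=(16, 8))   # intermediate features (8) -> image (16)
C = rng.normal(size=(3, 16))    # linear classifier with 3 classes


def decode_tail(h):
    """Decode (possibly perturbed) intermediate features into an image."""
    return W2 @ np.tanh(h)


def predict(x):
    return int(np.argmax(C @ x))


def is_adversarial(h, label):
    """Feasibility check: the decoded image must not be classified as label."""
    return predict(decode_tail(h)) != label


def distance_and_grad(h, x0):
    """Squared L2 distance to the initial image and its gradient w.r.t. h.

    This is a stand-in for the Wasserstein distance named in the abstract.
    """
    r = decode_tail(h) - x0
    grad = (1.0 - np.tanh(h) ** 2) * (W2.T @ (2.0 * r))
    return float(r @ r), grad


def inexact_projection(candidate, feasible, label, steps=20):
    """Bisect along the segment from a feasible point toward the candidate.

    Returns the farthest tested point that is still adversarial, so the
    result is always feasible (it is not an exact projection).
    """
    if is_adversarial(candidate, label):
        return candidate
    lo, hi = 0.0, 1.0  # lo is known feasible, hi is known infeasible
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if is_adversarial(feasible + mid * (candidate - feasible), label):
            lo = mid
        else:
            hi = mid
    return feasible + lo * (candidate - feasible)


def attack(h0, step=0.01, iters=200):
    x0 = decode_tail(h0)
    label = predict(x0)
    # Toy-specific feasible start: decode_tail is odd and the classifier is
    # linear, so negating h0 turns the top class into the bottom class.
    h = -h0
    d_init, _ = distance_and_grad(h, x0)
    best_h, best_d = h, d_init
    for _ in range(iters):
        _, grad = distance_and_grad(h, x0)
        h = inexact_projection(h - step * grad, h, label)
        d, _ = distance_and_grad(h, x0)
        if d < best_d:  # keep the closest feasible iterate seen so far
            best_h, best_d = h, d
    return best_h, label, d_init, best_d


h_adv, label, d_init, d_final = attack(rng.normal(size=8))
print(f"distance to initial image: {d_init:.3f} -> {d_final:.3f}")
```

Because the projection only ever returns points that were tested as misclassified, every iterate is feasible by construction, which mirrors the abstract's claim that the method always produces an adversarial image. With a real generative model, the gradient would normally come from automatic differentiation rather than the hand-written formula used here.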

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis