Structured Adversarial Attack: Towards General Implementation and Better Interpretability

Kaidi Xu; Sijia Liu; Pu Zhao; Pin-Yu Chen; Huan Zhang; Quanfu Fan; Deniz Erdogmus; Yanzhi Wang; Xue Lin

arXiv:1808.01664 · cs.LG · February 21, 2019 · 104 cites

TL;DR

This paper introduces a structured adversarial attack model, StrAttack, that leverages group sparsity and an ADMM framework to generate more interpretable adversarial examples while maintaining effectiveness across multiple datasets.

Contribution

The work proposes a novel structured attack model using group sparsity and an ADMM-based framework, enhancing interpretability and generality of adversarial attacks.

Findings

01. StrAttack achieves strong group sparsity in its perturbations.

02. It keeps Lp-norm distortion comparable to that of state-of-the-art attacks.

03. It offers better interpretability, as shown through saliency and activation maps.

Abstract

When generating adversarial examples to attack deep neural networks (DNNs), the Lp norm of the added perturbation is usually used to measure the similarity between the original image and the adversarial example. However, such adversarial attacks, which perturb the raw input space, may fail to capture structural information hidden in the input. This work develops a more general attack model, the structured attack (StrAttack), which explores group sparsity in adversarial perturbations by sliding a mask through images to extract key spatial structures. An ADMM (alternating direction method of multipliers)-based framework is proposed that splits the original problem into a sequence of analytically solvable subproblems and can be generalized to implement other attack methods. Strong group sparsity is achieved in adversarial perturbations even with the same level of Lp norm distortion…
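To make "group sparsity" and "analytically solvable subproblem" concrete, here is a minimal sketch of one such closed-form step: the proximal operator of a group-lasso penalty, tau * sum_i ||delta[G_i]||_2, applied to a perturbation tiled by a square mask. It is an illustration under stated assumptions, not the authors' implementation. The function names, the NumPy setting, the single-channel image, and the restriction to non-overlapping tiles are all hypothetical choices made here; overlapping masks would need additional handling that this sketch does not attempt.

```python
import numpy as np

def tile_groups(height, width, mask):
    """Hypothetical helper: non-overlapping mask x mask tiles of an image grid."""
    return [
        (slice(r, r + mask), slice(c, c + mask))
        for r in range(0, height - mask + 1, mask)
        for c in range(0, width - mask + 1, mask)
    ]

def group_soft_threshold(delta, groups, tau):
    """Closed-form prox of tau * sum_i ||delta[G_i]||_2 for disjoint groups.

    Each group is either shrunk toward zero as a whole or set exactly to
    zero, which is what produces group-sparse perturbations.
    Pixels not covered by any group are left unchanged.
    """
    out = np.asarray(delta, dtype=float).copy()
    for rows, cols in groups:
        block = out[rows, cols]
        norm = np.linalg.norm(block)  # l2 norm of the whole tile
        if norm > tau:
            out[rows, cols] = (1.0 - tau / norm) * block
        else:
            out[rows, cols] = 0.0
    return out

# Small usage example on a 4x4 perturbation with 2x2 tiles.
delta = np.zeros((4, 4))
delta[0:2, 0:2] = 3.0   # strong tile, l2 norm 6
delta[2:4, 2:4] = 0.5   # weak tile, l2 norm 1
groups = tile_groups(4, 4, mask=2)
sparse_delta = group_soft_threshold(delta, groups, tau=2.0)
active_groups = sum(np.any(sparse_delta[r, c] != 0) for r, c in groups)
```

In this example the strong tile is shrunk but survives and the weak tile is zeroed, so only one of the four groups stays active. Inside an ADMM loop, a step of this kind would alternate with other updates (for instance one handling the attack loss); those are omitted here.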

Code & Models

Repositories

KaidiXu/StrAttack · tf · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Interpretability · Alternating Direction Method of Multipliers