DifAttack: Query-Efficient Black-Box Attack via Disentangled Feature Space

Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian

arXiv:2309.14585·cs.CV·December 14, 2023

TL;DR

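DifAttack is a score-based black-box adversarial attack that disentangles an image's latent feature into an adversarial component and a visual component. The attack optimizes the adversarial component from the victim model's query feedback rather than from surrogate-model gradients; surrogate models are used to generate the adversarial examples that train the disentangling autoencoder.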

Contribution

The paper proposes an attack that operates in a disentangled feature space rather than over the entire feature space, improving query efficiency and attack success rate in black-box scenarios, especially in open-set conditions.

Findings

01 Achieves a higher attack success rate than existing methods.

02 Requires fewer queries to generate a successful adversarial example.

03 Performs well in targeted and open-set attack scenarios.

Abstract

This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (ASR) and good generalizability. We design a novel attack method based on a Disentangled Feature space, called DifAttack, which differs significantly from existing methods that operate over the entire feature space. Specifically, DifAttack first disentangles an image's latent feature into an adversarial feature and a visual feature, where the former dominates the adversarial capability of an image, while the latter largely determines its visual appearance. We train an autoencoder for the disentanglement using pairs of clean images and their Adversarial Examples (AEs), generated from available surrogate models via white-box attack methods. Finally, DifAttack iteratively optimizes the adversarial feature according to the query feedback from the victim model until a successful AE…
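The query loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `decode` and `victim_score` are hypothetical stand-ins for the trained decoder and the black-box victim, features are short lists of floats, and greedy random search is swapped in as the optimizer because the visible text does not say which update rule DifAttack uses. The only point it shows is that the adversarial feature is updated from query feedback while the visual feature stays fixed.

```python
import random


def decode(adv_feat, vis_feat):
    """Hypothetical stand-in for the trained decoder: fuse both features into an image."""
    return [v + a for v, a in zip(vis_feat, adv_feat)]


def victim_score(image):
    """Hypothetical black-box victim: returns a true-class margin.

    In this toy setup, a margin below 0 means the image is misclassified.
    """
    return sum(image)


def optimize_adversarial_feature(adv_feat, vis_feat, query_fn,
                                 max_queries=200, step=0.1, seed=0):
    """Greedy random search over the adversarial feature only (assumed optimizer)."""
    rng = random.Random(seed)
    best = list(adv_feat)                      # copy, so the caller's list is untouched
    best_score = query_fn(decode(best, vis_feat))
    queries = 1
    while best_score >= 0 and queries < max_queries:
        # Perturb the adversarial feature; the visual feature is never modified.
        candidate = [a + step * rng.gauss(0.0, 1.0) for a in best]
        score = query_fn(decode(candidate, vis_feat))
        queries += 1
        if score < best_score:                 # keep a candidate only if the margin drops
            best, best_score = candidate, score
    return best, best_score, queries


visual = [0.5, 0.5, 0.5, 0.5]
adversarial = [0.0, 0.0, 0.0, 0.0]
adv_out, margin, used = optimize_adversarial_feature(adversarial, visual, victim_score)
print("success:", margin < 0, "queries used:", used)
```

Because the victim is touched only through `query_fn`, the same loop would work with any score-returning model; a real attack would also need the trained autoencoder in place of the toy `decode`.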

Code & Models

Repositories

csjunjun/difattack (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Autoencoders