Adversarial Distillation for Ordered Top-k Attacks

Zekun Zhang; Tianfu Wu

arXiv:1905.10695 · cs.LG · May 28, 2019 · 1 citation

TL;DR

This paper introduces a novel adversarial distillation framework to generate ordered Top-k attacks on image classifiers, improving attack success rates over existing methods by leveraging label semantics and targeted distributions.

Contribution

It proposes a new adversarial distillation approach for ordered Top-k attacks, incorporating label semantic similarities to enhance attack effectiveness.

Findings

01 Outperforms C&W in Top-1 and Top-5 attack settings.

02 Significant improvements in attack success rates on ImageNet models.

03 Effective use of label semantics in adversarial attack generation.

Abstract

Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, especially white-box targeted attacks. One scheme for learning attacks is to design a proper adversarial objective function that leads to an imperceptible perturbation for any test image (e.g., the Carlini-Wagner (C&W) method). Most methods address targeted attacks in the Top-1 manner. In this paper, we propose to learn ordered Top-k attacks (k >= 1) for image classification tasks, that is, to enforce the Top-k predicted labels of an adversarial example to be the k (randomly) selected and ordered labels (the ground-truth label is excluded). To this end, we present an adversarial distillation framework: First, we compute an adversarial probability distribution for any given ordered Top-k targeted labels with respect to the ground-truth of a test image. Then, we learn adversarial examples by minimizing the Kullback-Leibler…
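
The two ideas the abstract names, an ordered Top-k success criterion and a target distribution matched through a Kullback-Leibler divergence, can be sketched roughly in NumPy as follows. This is only an illustration under stated assumptions, not the paper's construction: the geometric weighting (`decay`), the leftover mass (`rest_mass`), the direction of the KL term, and every function name here are hypothetical, since the abstract is cut off before it gives those details.

```python
import numpy as np

def ordered_topk_success(logits, targets):
    """Return True if the k highest-scoring labels equal `targets`, in order."""
    k = len(targets)
    topk = np.argsort(-np.asarray(logits), kind="stable")[:k]
    return bool(np.array_equal(topk, np.asarray(targets)))

def ordered_target_distribution(num_classes, targets, decay=0.5, rest_mass=0.05):
    """Hypothetical adversarial target distribution (an assumption, not the paper's).

    The ordered targets share (1 - rest_mass) of the probability with
    geometrically decaying weights; all other labels share rest_mass uniformly.
    """
    k = len(targets)
    weights = decay ** np.arange(k)  # 1, decay, decay^2, ...
    p = np.full(num_classes, rest_mass / (num_classes - k))
    p[np.asarray(targets)] = (1.0 - rest_mass) * weights / weights.sum()
    return p

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; the direction is an assumption."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Usage: 10 classes, ordered targets (7, 2, 5), and a model that is
# currently uniform over all labels.
targets = [7, 2, 5]
p_adv = ordered_target_distribution(10, targets)
q_model = softmax(np.zeros(10))
loss = kl_divergence(p_adv, q_model)  # quantity an attack would drive down
```

In an actual attack, `q_model` would be the classifier's output on the perturbed image, and the perturbation would be optimized until `ordered_topk_success` holds while the perturbation stays small.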

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Security and Verification in Computing