Removing Adversarial Noise in Class Activation Feature Space

Dawei Zhou; Nannan Wang; Chunlei Peng; Xinbo Gao; Xiaoyu Wang; Jun Yu; Tongliang Liu

arXiv:2104.09197 · cs.LG · April 20, 2021

Open Access

TL;DR

This paper introduces a novel self-supervised adversarial training method in class activation feature space to effectively remove adversarial noise, improving robustness against various attacks.

Contribution

It proposes a self-supervised adversarial training mechanism that denoises inputs in class activation feature space, targeting the error amplification effect that affects preprocessing-based defenses.

Findings

01 Significantly improves robustness against unseen attacks

02 Effective against adaptive adversarial attacks

03 Outperforms previous state-of-the-art methods

Abstract

Deep neural networks (DNNs) are vulnerable to adversarial noise. Preprocessing-based defenses could largely remove adversarial noise by processing inputs. However, they are typically affected by the error amplification effect, especially in the face of continuously evolving attacks. To solve this problem, in this paper, we propose to remove adversarial noise by implementing a self-supervised adversarial training mechanism in a class activation feature space. To be specific, we first maximize the disruptions to class activation features of natural examples to craft adversarial examples. Then, we train a denoising model to minimize the distances between the adversarial examples and the natural examples in the class activation feature space. Empirical evaluations demonstrate that our method could significantly enhance adversarial robustness in comparison to previous state-of-the-art…
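
The two steps in the abstract (craft adversarial examples by maximizing the disruption to class activation features, then train a denoiser to minimize the distance to the natural examples in that feature space) can be illustrated with a toy NumPy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the linear feature extractor `A`, the class weight vector `w_c`, the definition of class activation features as features weighted elementwise by `w_c`, the linear denoiser `P`, the L-infinity budget `eps`, and all function names.

```python
import numpy as np

def class_activation_features(A, w_c, x):
    # Assumed stand-in: linear features A @ x, weighted elementwise by class weights w_c.
    return w_c * (A @ x)

def craft_adversarial(A, w_c, x, eps=0.1, step=0.02, iters=20, rng=None):
    # Step 1 (toy version): signed gradient ascent on
    # ||caf(x + delta) - caf(x)||^2, keeping |delta| <= eps elementwise.
    rng = np.random.default_rng(0) if rng is None else rng
    M = w_c[:, None] * A                      # caf(x) = M @ x in this linear toy model
    delta = rng.uniform(-eps, eps, size=x.shape)  # random start: the gradient at 0 is 0
    for _ in range(iters):
        grad = 2.0 * M.T @ (M @ delta)
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return x + delta

def mean_caf_distance(A, w_c, P, pairs):
    # Mean squared distance between denoised adversarial and natural examples,
    # measured in the (assumed) class activation feature space.
    total = 0.0
    for x_adv, x_nat in pairs:
        diff = (class_activation_features(A, w_c, P @ x_adv)
                - class_activation_features(A, w_c, x_nat))
        total += float(diff @ diff)
    return total / len(pairs)

def train_denoiser(A, w_c, pairs, epochs=200):
    # Step 2 (toy version): full-batch gradient descent on a linear denoiser P,
    # minimizing mean_caf_distance. The step size is a conservative bound that
    # keeps the quadratic loss from increasing.
    M = w_c[:, None] * A
    P = np.eye(A.shape[1])                    # start from the identity (no denoising)
    max_sq_norm = max(float(x_adv @ x_adv) for x_adv, _ in pairs)
    lr = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2 * max_sq_norm)
    for _ in range(epochs):
        grad = np.zeros_like(P)
        for x_adv, x_nat in pairs:
            residual = M @ (P @ x_adv - x_nat)
            grad += 2.0 * np.outer(M.T @ residual, x_adv)
        P -= lr * grad / len(pairs)
    return P

# Hypothetical usage on random data.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 6))
w_c = rng.normal(size=8)
naturals = [rng.normal(size=6) for _ in range(16)]
pairs = [(craft_adversarial(A, w_c, x, rng=rng), x) for x in naturals]
before = mean_caf_distance(A, w_c, np.eye(6), pairs)
after = mean_caf_distance(A, w_c, train_denoiser(A, w_c, pairs), pairs)
print(f"mean CAF distance: {before:.4f} -> {after:.4f}")
```

The sketch only mirrors the objective structure stated in the abstract. A real setup would use a deep classifier and a learned denoising network, and the training loop would presumably regenerate adversarial examples as the denoiser changes rather than fixing them once as done here.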

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications