Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training
Dawei Zhou, Nannan Wang, Xinbo Gao, Bo Han, Jun Yu, Xiaoyu Wang, Tongliang Liu

TL;DR
This paper introduces JATP, a joint adversarial training method for input pre-processing defenses that improves the white-box robustness of DNNs against adversarial noise by countering the robustness degradation effect.
Contribution
The paper proposes a novel joint adversarial training approach for pre-processing defenses, incorporating feature similarity-based adversarial risk and pixel-wise loss to improve robustness.
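As a rough illustration of the two ingredients named above, the sketch below computes a feature-level dissimilarity and a pixel-wise loss on plain Python lists. The choice of cosine similarity and mean squared error, and the function names `feature_dissimilarity` and `pixel_loss`, are assumptions made for this example; the summary does not state which similarity measure or pixel loss the paper uses, nor how the two terms are combined.

```python
import math

def feature_dissimilarity(feat_a, feat_b):
    """1 - cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(a * a for a in feat_a))
    norm_b = math.sqrt(sum(b * b for b in feat_b))
    return 1.0 - dot / (norm_a * norm_b)

def pixel_loss(img_a, img_b):
    """Mean squared error over flattened pixel values (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)

# Hypothetical toy values: features of a clean input and of a perturbed input,
# plus a clean image and a pre-processed (restored) image, all flattened.
clean_feat = [1.0, 0.0, 2.0]
adv_feat = [0.0, 1.0, 2.0]
clean_img = [0.2, 0.4, 0.6, 0.8]
restored_img = [0.2, 0.5, 0.6, 0.7]

feat_gap = feature_dissimilarity(clean_feat, adv_feat)
pix_gap = pixel_loss(clean_img, restored_img)
```

Both quantities are zero when their two arguments are identical and grow as the perturbed features or restored pixels drift away from the clean reference.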
Findings
JATP effectively mitigates robustness degradation in white-box settings.
The method improves robustness across different target models.
In empirical evaluations, the method outperforms previous state-of-the-art defenses.
Abstract
Deep neural networks (DNNs) are vulnerable to adversarial noise. A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise, among which the input pre-processing methods are scalable and show great potential to safeguard DNNs. However, pre-processing methods may suffer from the robustness degradation effect, in which the defense reduces rather than improves the adversarial robustness of a target model in a white-box setting. A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model. To solve this problem, we investigate the influence of full adversarial examples, which are crafted against the full model, and find they indeed have a positive impact on the robustness of defenses. Furthermore, we find that simply changing the adversarial training examples in…
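The distinction between static adversarial examples and full adversarial examples can be made concrete with a toy sketch. Everything here is an assumption chosen for illustration: the target model is a linear logistic classifier with weights `w`, the pre-processor is a linear smoothing map `A`, and the attack is a single FGSM step. None of these are taken from the paper. A static example is crafted against the target model alone; a full example is crafted against the composition of pre-processor and target model.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def sign(v):
    return (v > 0) - (v < 0)

def logistic_loss(logit, y):
    # y is a label in {-1, +1}
    return math.log1p(math.exp(-y * logit))

def input_grad(weights, x, y):
    # Gradient of the logistic loss w.r.t. x when logit = weights . x
    coeff = -y / (1.0 + math.exp(y * dot(weights, x)))
    return [coeff * wi for wi in weights]

def fgsm(x, grad, eps):
    # One signed-gradient step that increases the loss
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w = [1.0, -2.0]                    # hypothetical target-model weights
A = [[0.5, 0.5], [0.5, 0.5]]       # hypothetical smoothing pre-processor
x, y, eps = [-1.0, -1.0], 1, 0.1   # clean input, label, perturbation budget

# Full model: logit = w . (A x) = (A^T w) . x
At_w = [dot([row[j] for row in A], w) for j in range(len(x))]

x_static = fgsm(x, input_grad(w, x, y), eps)    # ignores the pre-processor
x_full = fgsm(x, input_grad(At_w, x, y), eps)   # attacks the whole pipeline

def full_loss(inp):
    return logistic_loss(dot(w, matvec(A, inp)), y)

loss_clean, loss_static, loss_full = full_loss(x), full_loss(x_static), full_loss(x_full)
```

In this toy setup the static perturbation is averaged away by the smoothing map and leaves the pipeline's loss essentially unchanged, while the full perturbation raises it. The sketch only shows why examples crafted independently of the pre-processor can be uninformative about the white-box behavior of the combined model; it makes no claim about the paper's actual attack or training procedure.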
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
