TL;DR
This paper improves the adversarial robustness of neural networks by enforcing local and global compactness and the clustering assumption on an intermediate-layer representation during adversarial training.
Contribution
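The paper proposes the Adversary Divergence Reduction Network, which adds local/global compactness and clustering-assumption constraints over an intermediate layer to adversarial training, so that the representations of a clean example and its adversarial examples diverge less.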
Findings
Enhanced robustness against adversarial attacks.
Improved unperturbed and adversarial accuracy.
Outperforms existing adversarial training methods.
Abstract
The fact that deep neural networks are susceptible to crafted perturbations severely impacts the use of deep learning in certain domains of application. Among many developed defense models against such attacks, adversarial training emerges as the most successful method that consistently resists a wide range of attacks. In this work, based on an observation from a previous study that the representations of a clean data example and its adversarial examples become more divergent in higher layers of a deep neural net, we propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption over an intermediate layer of a deep neural network. We conduct comprehensive experiments to understand the isolating behavior of each component (i.e., local/global compactness and the clustering assumption) and compare our proposed model with…
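The compactness terms named in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the function names, the squared Euclidean distance, and the plain averaging are all assumptions made for illustration, and the clustering-assumption term is not covered. Here "local compactness" is read as keeping a clean example's intermediate representation close to that of its adversarial example, and "global compactness" as keeping representations of the same class close to each other.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_compactness(clean: List[Vector], adversarial: List[Vector]) -> float:
    """Hypothetical local term: mean squared distance between each clean
    representation and the representation of its own adversarial example."""
    if len(clean) != len(adversarial):
        raise ValueError("need one adversarial representation per clean one")
    if not clean:
        return 0.0
    total = sum(squared_distance(c, a) for c, a in zip(clean, adversarial))
    return total / len(clean)


def global_compactness(reps: List[Vector], labels: List[int]) -> float:
    """Hypothetical global term: mean squared distance over all pairs of
    representations that share a class label (0.0 if no such pair exists)."""
    if len(reps) != len(labels):
        raise ValueError("need one label per representation")
    total, count = 0.0, 0
    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            if labels[i] == labels[j]:
                total += squared_distance(reps[i], reps[j])
                count += 1
    return total / count if count else 0.0


# Toy intermediate-layer representations (made-up values, not paper data).
clean_reps = [[0.0, 0.0], [1.0, 1.0]]
adv_reps = [[0.0, 1.0], [1.0, 3.0]]
print(local_compactness(clean_reps, adv_reps))
print(global_compactness([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]], [0, 0, 1]))
```

In a training setup, terms like these would typically be added to the adversarial training loss with weighting coefficients. How the paper weights and combines them is not stated in the abstract above.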
