TL;DR
This paper presents CLAF, a training framework combining contrastive learning and adversarial training to enhance neural network robustness against attacks while preserving high accuracy on clean data.
Contribution
It introduces a novel approach that mitigates cognitive dissociation by updating the classification head during contrastive adversarial training, improving robustness and accuracy.
Findings
CLAF outperforms existing methods on CIFAR-10 in robust accuracy.
It maintains high clean accuracy alongside adversarial robustness.
The method effectively reduces cognitive dissociation between the embedding space and the classification head.
Abstract
In this paper, we introduce a novel neural network training framework that increases a model's robustness to adversarial attacks while maintaining high clean accuracy by combining contrastive learning (CL) with adversarial training (AT). We propose to improve model robustness by learning feature representations that are consistent under both data augmentations and adversarial perturbations. We leverage contrastive learning by treating an adversarial example as another positive example, and aim to maximize the similarity between random augmentations of data samples and their adversarial examples, while continually updating the classification head in order to avoid a cognitive dissociation between the classification head and the embedding space. This dissociation is caused by the fact that CL updates the network up to the…
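The idea of treating an adversarial example as an additional positive can be illustrated with a small contrastive loss over three views per sample (two random augmentations and one adversarial example). This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not give the exact loss, so the cosine-similarity, temperature-scaled form below, the function names `cosine` and `multi_positive_contrastive_loss`, and the default `temperature=0.5` are all hypothetical choices. Generating the adversarial example and updating the classification head are outside the scope of this sketch; embeddings are assumed to be precomputed, non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_positive_contrastive_loss(views, temperature=0.5):
    """Contrastive loss in which every view of a sample is a positive for the others.

    views[k][i] is the embedding of sample i under view k. In the setting
    described above, the views could be two augmentations plus one
    adversarial example (an assumption of this sketch).
    """
    # Flatten to (sample_index, embedding) pairs so each embedding can act as an anchor.
    flat = [(i, z) for view in views for i, z in enumerate(view)]
    total, count = 0.0, 0
    for a, (sample_a, z_a) in enumerate(flat):
        # Similarities from the anchor to every other embedding in the batch.
        logits = [
            (sample_b, cosine(z_a, z_b) / temperature)
            for b, (sample_b, z_b) in enumerate(flat)
            if b != a
        ]
        log_denominator = math.log(sum(math.exp(s) for _, s in logits))
        # Positives are the other views of the same sample; all else are negatives.
        for sample_b, s in logits:
            if sample_b == sample_a:
                total += log_denominator - s
                count += 1
    return total / count


# Two samples, three views each. In `consistent`, the third (adversarial)
# view stays close to the clean views; in `dissociated`, it lands on the
# other sample's embedding.
consistent = [[[1.0, 0.0], [0.0, 1.0]]] * 3
dissociated = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
print(multi_positive_contrastive_loss(consistent))
print(multi_positive_contrastive_loss(dissociated))
```

Under these assumptions, the loss is lower when the adversarial view stays near the clean views of the same sample, which is the consistency property the abstract describes.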
Taxonomy
Methods: Contrastive Learning
