Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks
Zhao Meng, Yihan Dong, Mrinmaya Sachan, Roger Wattenhofer

TL;DR
This paper introduces a self-supervised contrastive learning method using adversarial perturbations to enhance BERT's robustness against word substitution attacks without relying on labeled data.
Contribution
It proposes an adversarial contrastive learning approach that improves language model robustness without labeled data; combined with adversarial training, it yields higher robustness than adversarial training alone.
Findings
Improves BERT's robustness against four different word substitution-based adversarial attacks
Combining the method with adversarial training yields higher robustness than adversarial training alone
Requires only unlabeled data, which opens up the use of large text datasets for training robust language models
Abstract
In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning. We create a word-level adversarial attack generating hard positives on-the-fly as adversarial examples during contrastive learning. In contrast to previous works, our method improves model robustness without using any labeled data. Experimental results show that our method improves robustness of BERT against four different word substitution-based adversarial attacks, and combining our method with adversarial training gives higher robustness than adversarial training alone. As our method improves the robustness of BERT purely with unlabeled data, it opens up the possibility of using large text datasets to train robust language models against word substitution-based…
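The idea in the abstract, generating a hard positive by word substitution and using it in a contrastive objective, can be illustrated with a toy sketch. This is not the authors' implementation: the mean-of-word-vectors encoder stands in for BERT, the greedy search stands in for their word-level attack, and the InfoNCE-style loss is one common contrastive objective chosen here as an assumption. The names `EMB`, `SYNONYMS`, `encode`, `hard_positive`, and `info_nce` are hypothetical, and the vectors are made up for illustration.

```python
import math

# Toy 2-d word vectors and synonym sets (illustrative values, not from the paper).
EMB = {
    "good": [1.0, 0.0], "great": [0.9, 0.1], "fine": [0.6, 0.4],
    "movie": [0.0, 1.0], "film": [0.1, 0.9], "bad": [-1.0, 0.0],
}
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def encode(tokens, emb=EMB):
    """Stand-in encoder (assumption): mean of the word vectors."""
    dim = len(next(iter(emb.values())))
    return [sum(emb[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def hard_positive(tokens, synonyms=SYNONYMS, max_subs=1):
    """Greedy word substitution: pick the swap that moves the sentence
    representation furthest from the original (lowest cosine similarity)."""
    anchor = encode(tokens)
    current = list(tokens)
    for _ in range(max_subs):
        best, best_sim = None, cosine(anchor, encode(current))
        for i, tok in enumerate(current):
            # Candidates always come from the ORIGINAL word's synonym set.
            for cand in synonyms.get(tokens[i], []):
                if cand == tok:
                    continue
                trial = current[:i] + [cand] + current[i + 1:]
                sim = cosine(anchor, encode(trial))
                if sim < best_sim:
                    best, best_sim = trial, sim
        if best is None:  # no substitution lowers the similarity further
            break
        current = best
    return current


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: pull the positive towards the anchor and push
    the negatives away."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


sentence = ["good", "movie"]
adversarial = hard_positive(sentence)          # generated on the fly
negatives = [encode(["bad", "movie"])]         # other sentences in the batch
loss_hard = info_nce(encode(sentence), encode(adversarial), negatives)
loss_easy = info_nce(encode(sentence), encode(["great", "movie"]), negatives)
```

In this toy setting the adversarially chosen positive gives a larger loss than a milder synonym swap, which is the sense in which it is a "hard" positive: the encoder receives a stronger training signal to map the perturbed sentence close to the original. No labels are used at any point.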
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning
Methods: Attention Is All You Need · Linear Layer · Contrastive Learning · Weight Decay · Multi-Head Attention · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Softmax
