Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks

Zhao Meng; Yihan Dong; Mrinmaya Sachan; Roger Wattenhofer

arXiv:2107.07610 · cs.CL · May 25, 2022

TL;DR

This paper introduces a self-supervised contrastive learning method using adversarial perturbations to enhance BERT's robustness against word substitution attacks without relying on labeled data.

Contribution

It proposes an adversarial contrastive learning approach that improves language-model robustness without labeled data. Combined with adversarial training, it yields higher robustness than adversarial training alone.

Findings

01 Improves the robustness of BERT against four word substitution-based attacks

02 Combining the method with adversarial training gives higher robustness than adversarial training alone

03 Works with unlabeled data only, so large text datasets can be used for training

Abstract

In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning. We create a word-level adversarial attack generating hard positives on-the-fly as adversarial examples during contrastive learning. In contrast to previous works, our method improves model robustness without using any labeled data. Experimental results show that our method improves robustness of BERT against four different word substitution-based adversarial attacks, and combining our method with adversarial training gives higher robustness than adversarial training alone. As our method improves the robustness of BERT purely with unlabeled data, it opens up the possibility of using large text datasets to train robust language models against word substitution-based…
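The idea in the abstract, picking a word substitution that acts as a hard positive and then pulling it back toward the original sentence with a contrastive loss, can be sketched roughly as follows. This is a toy illustration, not the authors' implementation. The names `hardest_positive`, `info_nce`, `encode`, `TOY_VECTORS`, and `SYNONYMS` are hypothetical. The bag-of-vectors encoder stands in for BERT, the greedy search is an assumed stand-in for the paper's word-level attack, and the InfoNCE form of the loss is an assumption, since the abstract does not specify it.

```python
import math

# Hypothetical 2-d word vectors and synonym sets, used only for illustration.
TOY_VECTORS = {
    "good": [1.0, 0.0], "great": [0.9, 0.3], "fine": [0.6, 0.8],
    "bad": [-1.0, 0.0], "movie": [0.0, 1.0], "film": [0.1, 0.9],
}
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}


def encode(tokens):
    """Toy sentence encoder (stand-in for BERT): sum of word vectors."""
    return [sum(TOY_VECTORS[t][d] for t in tokens) for d in range(2)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_positive(tokens, synonyms, encoder, max_swaps=1):
    """Greedy word substitution: at each step, apply the single synonym swap
    that lowers similarity to the original sentence the most."""
    anchor = encoder(tokens)
    current = list(tokens)
    for _ in range(max_swaps):
        best_sim, best_candidate = cosine(anchor, encoder(current)), None
        for i, token in enumerate(current):
            for substitute in synonyms.get(token, []):
                candidate = current[:i] + [substitute] + current[i + 1:]
                sim = cosine(anchor, encoder(candidate))
                if sim < best_sim:
                    best_sim, best_candidate = sim, candidate
        if best_candidate is None:  # no swap makes the example harder
            break
        current = best_candidate
    return current


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax of the positive's
    similarity among the positive and all negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


sentence = ["good", "movie"]
adversarial = hardest_positive(sentence, SYNONYMS, encode)
negatives = [encode(["bad", "movie"])]
loss = info_nce(encode(sentence), encode(adversarial), negatives)
```

In this sketch the perturbed sentence needs no label: it is treated as a positive only because it was derived from the original by synonym swaps, which is what lets the objective run on unlabeled text. In a real setup the encoder would be trained by minimizing this loss over batches, with the other sentences in the batch serving as negatives.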

Code & Models

Repositories

LotusDYH/ssl_robust (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Linear Layer · Contrastive Learning · Weight Decay · Multi-Head Attention · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Softmax