Towards Improving Adversarial Training of NLP Models
Jin Yong Yoo, Yanjun Qi

TL;DR
This paper introduces A2T, a simple and efficient adversarial training method for NLP models that enhances robustness, accuracy, and generalization by using a novel word substitution attack during training.
Contribution
The paper proposes A2T, a vanilla adversarial training process for NLP models built around a new, computationally cheaper word substitution attack, which makes adversarial training more practical and improves model robustness.
Findings
A2T improves robustness against word substitution attacks.
A2T enhances standard accuracy and cross-domain generalization.
A2T is computationally cheaper than existing methods.
Abstract
Adversarial training, a method for learning robust deep neural networks, constructs adversarial examples during training. However, recent methods for generating NLP adversarial examples involve combinatorial search and expensive sentence encoders for constraining the generated instances. As a result, it remains challenging to use vanilla adversarial training to improve NLP models' performance, and the benefits are mainly uninvestigated. This paper proposes a simple and improved vanilla adversarial training process for NLP models, which we name Attacking to Training (A2T). The core part of A2T is a new and cheaper word substitution attack optimized for vanilla adversarial training. We use A2T to train BERT and RoBERTa models on IMDB, Rotten Tomatoes, Yelp, and SNLI datasets. Our results empirically show that it is possible to train robust NLP models using a much cheaper adversary. We…
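The general idea of vanilla adversarial training with a word substitution attack can be illustrated with a toy sketch. This is not the A2T attack itself, whose details are not given in this summary. The synonym table, the lexicon-based scorer, and every function name below are hypothetical stand-ins chosen so the example runs without a trained model.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical synonym table (an assumption for illustration); a real
# attack would need a principled source of substitution candidates.
SYNONYMS: Dict[str, List[str]] = {
    "good": ["fine", "decent"],
    "great": ["fine", "notable"],
    "bad": ["poor", "weak"],
}

# Toy sentiment lexicon standing in for a trained classifier.
WEIGHTS: Dict[str, float] = {
    "good": 1.0, "great": 2.0, "decent": 0.2, "bad": -1.5, "poor": -1.0,
}


def score(tokens: List[str]) -> float:
    """Sum of per-word weights; unknown words contribute 0."""
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)


def predict(tokens: List[str]) -> int:
    """Label 1 (positive) if the score is strictly positive, else 0."""
    return 1 if score(tokens) > 0 else 0


def greedy_substitution_attack(
    tokens: List[str], label: int, max_swaps: int = 2
) -> Optional[List[str]]:
    """Greedily swap one word per step for the synonym that most reduces
    the margin of the true label. Returns the perturbed tokens if the
    prediction flips within max_swaps steps, otherwise None."""
    sign = 1.0 if label == 1 else -1.0  # margin = sign * score
    current = list(tokens)
    for _ in range(max_swaps):
        best, best_margin = None, sign * score(current)
        for i, word in enumerate(current):
            for cand in SYNONYMS.get(word, []):
                trial = current[:i] + [cand] + current[i + 1:]
                margin = sign * score(trial)
                if margin < best_margin:
                    best, best_margin = trial, margin
        if best is None:  # no swap lowers the margin any further
            break
        current = best
        if predict(current) != label:
            return current
    return None


def augment_with_adversarial(
    dataset: List[Tuple[List[str], int]]
) -> List[Tuple[List[str], int]]:
    """Keep the clean examples and append each successful adversarial
    example with its original label, as in vanilla adversarial training."""
    augmented = list(dataset)
    for tokens, label in dataset:
        adv = greedy_substitution_attack(tokens, label)
        if adv is not None:
            augmented.append((adv, label))
    return augmented
```

In an actual training loop, the augmented set would be regenerated against the current model every epoch or so, and the model would then be trained on the mix of clean and adversarial examples. The cost of the attack step is what dominates, which is why a cheaper adversary matters.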
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)
Methods: Attention Is All You Need · Linear Layer · Dropout · Softmax · Attention Dropout · Multi-Head Attention · WordPiece · Dense Connections · Linear Warmup With Linear Decay
