Adversarial Training for Large Neural Language Models
Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung, Poon, Jianfeng Gao

TL;DR
This paper introduces ALUM, an adversarial training algorithm for large neural language models that improves both their generalization and robustness across various NLP tasks and training stages.
Contribution
The paper presents the first comprehensive study of adversarial pre-training for large language models, demonstrating significant improvements over existing methods like BERT and RoBERTa.
Findings
ALUM improves model performance on a wide range of NLP tasks.
Adversarial pre-training enhances robustness against adversarial attacks.
ALUM yields additional gains when combined with task-specific fine-tuning.
Abstract
Generalization and robustness are both key desiderata for designing machine learning methods. Adversarial training can enhance robustness, but past work often finds it hurts generalization. In natural language processing (NLP), pre-training large neural language models such as BERT have demonstrated impressive gain in generalization for a variety of tasks, with further improvement from adversarial fine-tuning. However, these models are still vulnerable to adversarial attacks. In this paper, we show that adversarial pre-training can improve both generalization and robustness. We propose a general algorithm ALUM (Adversarial training for large neural LangUage Models), which regularizes the training objective by applying perturbations in the embedding space that maximizes the adversarial loss. We present the first comprehensive study of adversarial training in all stages, including…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Topic Modeling · Natural Language Processing Techniques
MethodsLinear Layer · RoBERTa · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Refunds@Expedia|||How do I get a full refund from Expedia? · Dense Connections · Adam · WordPiece
