Efficient Adversarial Training with Robust Early-Bird Tickets
Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

TL;DR
This paper introduces a novel efficient adversarial training method for pre-trained language models that leverages early emergence of robust subnetworks, significantly reducing training time while maintaining robustness.
Contribution
The authors discover robust early-bird tickets in early training phases and develop a structured sparsity search method to accelerate adversarial training.
Findings
Achieves 7 to 13 times faster training speed.
Maintains or improves robustness compared to state-of-the-art methods.
Automatically terminates ticket search with a convergence metric.
Abstract
Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs). However, this approach is typically more expensive than traditional fine-tuning because of the necessity to generate adversarial examples via gradient descent. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically epochs), far before parameters converge. Inspired by this finding, we dig out robust early-bird tickets (i.e., subnetworks) to develop an efficient adversarial training method: (1) searching for robust tickets with structured sparsity in the early stage; (2) fine-tuning robust tickets in the remaining time. To extract the robust tickets as early as possible, we design a ticket convergence metric to automatically terminate the searching process.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Adversarial Robustness in Machine Learning · Natural Language Processing Techniques
