In and Out-of-Domain Text Adversarial Robustness via Label Smoothing
Yahan Yang, Soham Dan, Dan Roth, Insup Lee

TL;DR
This paper investigates how label smoothing enhances the adversarial robustness of NLP models like BERT across in-domain and out-of-domain tasks, showing it reduces overconfidence and improves resistance to attacks.
Contribution
The paper is the first to systematically evaluate label smoothing as a defense against adversarial attacks on NLP models in diverse settings.
Findings
Label smoothing significantly improves the adversarial robustness of pre-trained models such as BERT.
It reduces over-confident errors on adversarial examples.
It enhances model stability across various NLP tasks.
Abstract
Recently it has been shown that state-of-the-art NLP models are vulnerable to adversarial attacks, where the predictions of a model can be drastically altered by slight modifications to the input (such as synonym substitutions). While several defense techniques have been proposed and adapted to the discrete nature of text adversarial attacks, the benefits of general-purpose regularization methods such as label smoothing for language models have not been studied. In this paper, we study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks in both in-domain and out-of-domain settings. Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT against various popular attacks. We also analyze the relationship between prediction confidence and robustness, showing…
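To make the idea concrete, here is a minimal sketch of standard (uniform) label smoothing and its effect on the cross-entropy loss. It is an illustration only, not the authors' implementation: the paper studies several smoothing strategies, and the function names, the smoothing factor alpha = 0.2, and the example logits are all assumptions chosen for this sketch.

```python
import math

def smooth_labels(true_class, num_classes, alpha=0.1):
    """Uniform label smoothing: mix the one-hot target with a uniform distribution.

    alpha is a hypothetical smoothing factor; alpha = 0 gives the usual one-hot target.
    """
    target = [alpha / num_classes] * num_classes
    target[true_class] += 1.0 - alpha
    return target

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Two illustrative predictions for a binary task whose true class is 0.
confident_logits = [8.0, 0.0]  # assigns nearly all probability to class 0
moderate_logits = [2.0, 0.0]   # still predicts class 0, with less confidence

hard_target = smooth_labels(0, 2, alpha=0.0)  # one-hot: [1.0, 0.0]
soft_target = smooth_labels(0, 2, alpha=0.2)  # smoothed: approximately [0.9, 0.1]

hard_losses = (cross_entropy(hard_target, confident_logits),
               cross_entropy(hard_target, moderate_logits))
soft_losses = (cross_entropy(soft_target, confident_logits),
               cross_entropy(soft_target, moderate_logits))
```

With the one-hot target, the more confident prediction always has the lower loss, so training keeps pushing confidence up. With the smoothed target, the extremely confident prediction incurs the higher loss of the two, which is one way to see why label smoothing discourages over-confident outputs.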
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Attention Dropout · Residual Connection · Weight Decay · WordPiece · Linear Warmup With Linear Decay
