In and Out-of-Domain Text Adversarial Robustness via Label Smoothing

Yahan Yang; Soham Dan; Dan Roth; Insup Lee

arXiv:2212.10258·cs.CL·July 13, 2023

TL;DR

This paper investigates how label smoothing enhances the adversarial robustness of NLP models like BERT across in-domain and out-of-domain tasks, showing it reduces overconfidence and improves resistance to attacks.

Contribution

The paper presents the first systematic evaluation of label smoothing as a defense for NLP models against adversarial attacks, in both in-domain and out-of-domain settings.

Findings

01. Label smoothing significantly improves adversarial robustness.

02. Label smoothing reduces over-confident errors on adversarial examples.

03. Label smoothing enhances model stability across various NLP tasks.

Abstract

Recently it has been shown that state-of-the-art NLP models are vulnerable to adversarial attacks, where the predictions of a model can be drastically altered by slight modifications to the input (such as synonym substitutions). While several defense techniques have been proposed and adapted to the discrete nature of text adversarial attacks, the benefits of general-purpose regularization methods such as label smoothing for language models have not been studied. In this paper, we study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks in both in-domain and out-of-domain settings. Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT against various popular attacks. We also analyze the relationship between prediction confidence and robustness, showing…
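For readers unfamiliar with the technique, the following is a minimal sketch of standard (uniform) label smoothing, not the authors' implementation. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution using a smoothing factor `alpha`; the function names, the value of `alpha`, and the example logits are all hypothetical and chosen only for illustration. The abstract mentions several smoothing strategies, and this sketch covers only the uniform one.

```python
import math


def smooth_labels(true_class, num_classes, alpha=0.1):
    """Return a smoothed target: (1 - alpha) * one_hot + alpha / num_classes.

    `alpha` is an illustrative smoothing factor, not a value from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    uniform_mass = alpha / num_classes
    return [
        (1.0 - alpha) + uniform_mass if k == true_class else uniform_mass
        for k in range(num_classes)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Hypothetical binary example: a very confident prediction for class 0.
confident_logits = [10.0, -10.0]
hard_target = smooth_labels(0, 2, alpha=0.0)  # plain one-hot target
soft_target = smooth_labels(0, 2, alpha=0.1)  # smoothed target

print("loss with one-hot target: ", cross_entropy(hard_target, confident_logits))
print("loss with smoothed target:", cross_entropy(soft_target, confident_logits))
```

With a smoothed target, an extremely confident prediction is still penalized even when it is correct, which is the mechanism usually credited with reducing overconfidence. In practice a framework's built-in option would normally be used instead of hand-written code.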

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Attention Dropout · Residual Connection · Weight Decay · WordPiece · Linear Warmup With Linear Decay