TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization

Bairu Hou; Jinghan Jia; Yihua Zhang; Guanhua Zhang; Yang Zhang; Sijia Liu; Shiyu Chang

arXiv:2212.09254 · cs.CL · December 20, 2022 · 1 citation

TL;DR

TextGrad introduces a gradient-driven optimization framework for generating high-quality adversarial examples in NLP, addressing unique challenges of discrete text and fluency constraints to improve robustness evaluation and defense.

Contribution

It presents a principled first-order, PGD-like attack generator for NLP that handles both the discrete nature of text and the fluency constraint on perturbed inputs, and that can be used for robustness assessment and for adversarial training.

Findings

01

TextGrad achieves high attack success rates in NLP robustness evaluation.

02

Incorporating TextGrad into adversarial training improves model robustness.

03

Extensive experiments validate the effectiveness of TextGrad in attack and defense scenarios.

Abstract

Robustness evaluation against adversarial examples has become increasingly important to unveil the trustworthiness of the prevailing deep models in natural language processing (NLP). However, in contrast to the computer vision domain where the first-order projected gradient descent (PGD) is used as the benchmark approach to generate adversarial examples for robustness evaluation, there lacks a principled first-order gradient-based robustness evaluation framework in NLP. The emerging optimization challenges lie in 1) the discrete nature of textual inputs together with the strong coupling between the perturbation location and the actual content, and 2) the additional constraint that the perturbed text should be fluent and achieve a low perplexity under a language model. These challenges make the development of PGD-like NLP attacks difficult. To bridge the gap, we propose TextGrad, a new…
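For readers unfamiliar with the PGD baseline the abstract refers to, the following is a minimal sketch of projected gradient descent on a continuous input, assuming a toy logistic-regression classifier and an L-infinity perturbation budget. It is not TextGrad and not the authors' code; the function names, the model, and the parameters `eps`, `step`, and `iters` are hypothetical choices made for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(w, b, x, y):
    """Logistic loss for a label y in {0, 1} and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]  # d(loss)/d(x_i) for a linear model
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, b, x, y, eps=0.5, step=0.1, iters=20):
    """Toy PGD: signed gradient ascent on the loss, projected onto an L-inf ball."""
    x_adv = list(x)
    for _ in range(iters):
        _, grad = loss_and_grad(w, b, x_adv, y)
        # First-order step that increases the loss.
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: clip every coordinate back into [x0 - eps, x0 + eps].
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
    return x_adv


# Hypothetical toy model and input, chosen only to exercise the loop.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = pgd_linf(w, b, x, y)
clean_loss, _ = loss_and_grad(w, b, x, y)
adv_loss, _ = loss_and_grad(w, b, x_adv, y)
```

Both the signed step and the clipping projection rely on the input being a real-valued vector. A sentence is a sequence of discrete tokens, so neither operation carries over directly, which is the first of the two challenges the abstract names.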

Code & Models

Repositories

ucsb-nlp-chang/textgrad
PyTorch · Official

Videos

TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning