A Context Aware Approach for Generating Natural Language Attacks

Rishabh Maheshwary; Saket Maheshwary; Vikram Pudi

arXiv:2012.13339 · cs.CL · December 25, 2020

TL;DR

This paper introduces a context-aware black-box attack that generates semantically similar adversarial examples against NLP models. It uses masked language modelling and next sentence prediction to choose replacement words, and reports higher success rates with a lower word perturbation percentage than prior attacks.

Contribution

The paper presents an attack strategy that selects candidate words using both the original word and its surrounding context, jointly leveraging masked language modelling and next sentence prediction to craft adversarial examples.

Findings

01. Higher success rate in attacking NLP models
02. Reduced word perturbation compared to prior methods
03. Effective in text classification and entailment tasks

Abstract

We study an important task of attacking natural language processing models in a black box setting. We propose an attack strategy that crafts semantically similar adversarial examples on text classification and entailment tasks. Our proposed attack finds candidate words by considering the information of both the original word and its surrounding context. It jointly leverages masked language modelling and next sentence prediction for context understanding. In comparison to attacks proposed in prior literature, we are able to generate high quality adversarial examples that do significantly better both in terms of success rate and word perturbation percentage.
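The general shape of a black-box word-substitution attack, together with the word perturbation percentage mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_predict`, `toy_propose`, `TOY_CANDIDATES`, `greedy_attack`, and `perturbation_percentage` are all hypothetical names. The candidate generator is a fixed lookup table that ignores context, standing in for the masked language model and next sentence prediction steps the abstract describes. The left-to-right greedy search and the metric definition (changed words divided by total words) are likewise assumptions.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical candidate table. In the paper's setting, candidates would be
# proposed with masked language modelling and next sentence prediction; a
# fixed lookup stands in here so the sketch runs without any model.
TOY_CANDIDATES: Dict[str, List[str]] = {
    "great": ["fine", "decent"],
    "loved": ["watched", "liked"],
}

POSITIVE_WORDS = {"great", "loved", "liked"}


def toy_predict(words: Sequence[str]) -> float:
    """Toy black-box target: returns P(positive) from keyword counts."""
    hits = sum(1 for w in words if w in POSITIVE_WORDS)
    return min(1.0, 0.2 + 0.4 * hits)


def toy_propose(words: Sequence[str], index: int) -> List[str]:
    """Toy candidate generator for the word at `index` (ignores context)."""
    return TOY_CANDIDATES.get(words[index], [])


def greedy_attack(
    words: Sequence[str],
    predict: Callable[[Sequence[str]], float],
    propose: Callable[[Sequence[str], int], List[str]],
) -> List[str]:
    """Replace words left to right until the predicted label flips.

    Only the model's output score is queried (black-box access).
    """
    current = list(words)
    original_is_positive = predict(current) >= 0.5

    def confidence(ws: Sequence[str]) -> float:
        # Probability the model assigns to the ORIGINAL label.
        p = predict(ws)
        return p if original_is_positive else 1.0 - p

    for i in range(len(current)):
        best_word, best_conf = current[i], confidence(current)
        for cand in propose(current, i):
            trial = current[:i] + [cand] + current[i + 1:]
            trial_conf = confidence(trial)
            if trial_conf < best_conf:  # keep the most damaging substitute
                best_word, best_conf = cand, trial_conf
        current[i] = best_word
        if (predict(current) >= 0.5) != original_is_positive:
            break  # label flipped: the attack succeeded
    return current


def perturbation_percentage(
    original: Sequence[str], adversarial: Sequence[str]
) -> float:
    """Percentage of positions whose word differs (assumed definition)."""
    changed = sum(1 for a, b in zip(original, adversarial) if a != b)
    return 100.0 * changed / len(original)
```

Calling `greedy_attack("the movie was great and i loved it".split(), toy_predict, toy_propose)` returns the perturbed word list, and `perturbation_percentage` then compares it with the original. A real attack would swap in a trained target model for `toy_predict` and a context-aware generator for `toy_propose`.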

Code & Models

Repositories

RishabhMaheshwary/contextattack (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection