Context-aware Adversarial Attack on Named Entity Recognition
Shuguang Chen, Leonardo Neves, and Thamar Solorio

TL;DR
This paper introduces a context-aware adversarial attack on named entity recognition (NER) models. It perturbs the words that are most informative for recognizing entities, and shows that the models are vulnerable to the resulting natural, plausible adversarial examples.
Contribution
It proposes a novel approach to generate natural, plausible adversarial examples by perturbing informative words in NER tasks, revealing model vulnerabilities.
Findings
The proposed methods outperform strong baselines in deceiving NER models.
Perturbing key words significantly reduces model accuracy.
The approach produces more natural adversarial examples.
Abstract
In recent years, large pre-trained language models (PLMs) have achieved remarkable performance on many natural language processing benchmarks. Despite their success, prior studies have shown that PLMs are vulnerable to attacks from adversarial examples. In this work, we focus on the named entity recognition task and study context-aware adversarial attack methods to examine the model's robustness. Specifically, we propose perturbing the most informative words for recognizing entities to create adversarial examples and investigate different candidate replacement methods to generate natural and plausible adversarial examples. Experiments and analyses show that our methods are more effective in deceiving the model into making wrong predictions than strong baselines.
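The two steps named in the abstract, selecting informative words and replacing them with candidates, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a leave-one-out masking score as a stand-in for "informativeness", a hand-written substitution dictionary as a stand-in for the candidate replacement methods, and a toy scoring function in place of a real NER model. All names (`rank_context_words`, `attack`, `toy_score`, `CUES`) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical scorer: returns the model's confidence in the gold entity labels.
Scorer = Callable[[Sequence[str]], float]


def rank_context_words(tokens: Sequence[str], entity_mask: Sequence[bool],
                       score: Scorer, mask_token: str = "[MASK]") -> List[int]:
    """Rank non-entity positions by how much masking them lowers the score."""
    base = score(tokens)
    drops = []
    for i, is_entity in enumerate(entity_mask):
        if is_entity:
            continue  # entity tokens stay intact so the gold labels remain valid
        masked = list(tokens)
        masked[i] = mask_token
        drops.append((base - score(masked), i))
    # Largest drop first; ties are broken by position.
    drops.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in drops]


def attack(tokens: Sequence[str], entity_mask: Sequence[bool], score: Scorer,
           candidates: Dict[str, List[str]], budget: int = 1) -> List[str]:
    """Replace up to `budget` informative context words with the candidate
    that lowers the score the most (assumed greedy search)."""
    adversarial = list(tokens)
    for i in rank_context_words(tokens, entity_mask, score)[:budget]:
        best_word, best_score = adversarial[i], score(adversarial)
        for word in candidates.get(adversarial[i], []):
            trial = adversarial[:i] + [word] + adversarial[i + 1:]
            trial_score = score(trial)
            if trial_score < best_score:
                best_word, best_score = word, trial_score
        adversarial[i] = best_word
    return adversarial


# Toy stand-in for an NER model: confidence rises with certain cue words.
CUES = {"visited": 0.6, "yesterday": 0.2}


def toy_score(tokens: Sequence[str]) -> float:
    return 0.1 + sum(CUES.get(t, 0.0) for t in tokens)


tokens = ["She", "visited", "Paris", "yesterday"]
entity_mask = [False, False, True, False]  # "Paris" is the entity
candidates = {"visited": ["saw", "toured"], "yesterday": ["recently"]}

ranking = rank_context_words(tokens, entity_mask, toy_score)
adversarial = attack(tokens, entity_mask, toy_score, candidates, budget=1)
print(ranking, adversarial)
```

In this sketch the entity token is never modified, so the gold labels of the original sentence still apply to the perturbed one. With a real model, `score` would wrap the tagger's probability of the gold label sequence, and the candidate dictionary would be replaced by whichever replacement method is under study.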
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Focus
