Breaking BERT: Understanding its Vulnerabilities for Named Entity Recognition through Adversarial Attack
Anne Dirkson, Suzan Verberne, Wessel Kraaij

TL;DR
This paper investigates the vulnerabilities of BERT-based models for Named Entity Recognition by applying adversarial attacks. It finds that the models are sensitive to changes in the local context of entities and to emergent entities, and that SciBERT, the domain-specific model trained from scratch, is the most vulnerable of the models tested.
Contribution
The paper analyzes how BERT-based NER models fail under adversarial variation of the input, highlighting the need for robustness improvements.
Findings
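BERT models are vulnerable to variation in the entity context in NER: 20.2 to 45.0% of entities are predicted completely wrong and another 29.3 to 53.3% partially wrong, with changes to the local context having the largest effect.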
Single modifications can often fool the models.
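SciBERT, the domain-specific model trained from scratch, is more vulnerable than the original BERT model and than BioBERT, the domain-specific model that retains the BERT vocabulary.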
Abstract
Both generic and domain-specific BERT models are widely used for natural language processing (NLP) tasks. In this paper we investigate the vulnerability of BERT models to variation in input data for Named Entity Recognition (NER) through adversarial attack. Experimental results show that BERT models are vulnerable to variation in the entity context with 20.2 to 45.0% of entities predicted completely wrong and another 29.3 to 53.3% of entities predicted wrong partially. BERT models seem most vulnerable to changes in the local context of entities and often a single change is sufficient to fool the model. The domain-specific BERT model trained from scratch (SciBERT) is more vulnerable than the original BERT model or the domain-specific model that retains the BERT vocabulary (BioBERT). We also find that BERT models are particularly vulnerable to emergent entities. Our results chart the…
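The abstract reports the share of entities predicted completely wrong and partially wrong under attack. The exact scoring rules are not given in the text above, so the following is only a minimal sketch under stated assumptions: labels use the BIO scheme, an entity counts as completely wrong when none of its token labels match the gold labels, and as partially wrong when some but not all match. The function names and the DRUG/ADR label types are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple


def entity_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) token spans of entities in a BIO tag sequence."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            # A new entity begins here; close any entity that was still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            # An "O" tag closes the open entity.
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def classify_entities(gold: List[str], pred: List[str]) -> Dict[str, int]:
    """Count gold entities whose predicted labels are correct, partially wrong,
    or completely wrong (assumed definitions, not taken from the paper)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    counts = {"correct": 0, "partially_wrong": 0, "completely_wrong": 0}
    for start, end in entity_spans(gold):
        matches = sum(g == p for g, p in zip(gold[start:end], pred[start:end]))
        if matches == end - start:
            counts["correct"] += 1
        elif matches == 0:
            counts["completely_wrong"] += 1
        else:
            counts["partially_wrong"] += 1
    return counts


# Toy example: gold labels and hypothetical predictions on a perturbed sentence.
gold = ["O", "B-DRUG", "I-DRUG", "O", "B-ADR", "I-ADR", "I-ADR", "O", "B-DRUG"]
pred = ["O", "B-DRUG", "I-DRUG", "O", "B-ADR", "O", "O", "O", "O"]
print(classify_entities(gold, pred))
```

In this toy input the first entity is fully recovered, the second keeps only its first token, and the third is missed entirely, so each category receives one entity. Dividing the counts by the number of gold entities would give percentages comparable in form to those quoted in the abstract.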
Taxonomy
Topics
Topic Modeling · Adversarial Robustness in Machine Learning · Natural Language Processing Techniques
Methods
Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Weight Decay · Residual Connection · Layer Normalization · WordPiece
