Adv-BERT: BERT is not robust on misspellings! Generating nature adversarial samples on BERT

Lichao Sun; Kazuma Hashimoto; Wenpeng Yin; Akari Asai; Jia Li; Philip S. Yu; Caiming Xiong

arXiv:2003.04985·cs.CL·March 12, 2020·80 cites

TL;DR

This paper investigates how robust BERT is to natural, inadvertent typos. Typos in informative words impair performance the most, and humans and models differ in which errors they notice.

Contribution

The paper systematically analyzes BERT's vulnerability to natural typos, showing which error types are most damaging and how models differ from humans in handling them.

Findings

01. Typos in informative words cause more damage.

02. Mistypes are more harmful than insertions or deletions.

03. Humans and machines focus differently on adversarial errors.

Abstract

A growing body of literature claims that deep neural networks are brittle when dealing with adversarial examples that are created maliciously. It is unclear, however, how the models will perform in realistic scenarios where natural rather than malicious adversarial instances often exist. This work systematically explores the robustness of BERT, the state-of-the-art Transformer-style model in NLP, in dealing with noisy data, particularly keyboard typing mistakes that occur inadvertently. Intensive experiments on sentiment analysis and question answering benchmarks indicate that: (i) Typos in different words of a sentence do not have equal influence. Typos in informative words cause more severe damage; (ii) Mistyping is the most damaging factor, compared with inserting, deleting, etc.; (iii) Humans and machines have different focuses on recognizing…
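The typo categories named in the abstract (mistyping, inserting, deleting) can be illustrated with a minimal Python sketch. This is not the authors' generator: the function names, the `ROWS` layout, and the rule that a key's neighbors are only its left and right keys on the same QWERTY row are all assumptions made for illustration, and only lowercase letters are handled.

```python
import random

# Simplified QWERTY layout (assumption): neighbors are the keys directly
# to the left and right on the same row; vertical neighbors are ignored.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def row_neighbors(ch):
    """Return the keys adjacent to `ch` on its keyboard row (may be empty)."""
    for row in ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []


def mistype(word, pos, rng):
    """Replace the character at `pos` with a neighboring key."""
    neighbors = row_neighbors(word[pos])
    if not neighbors:
        return word  # no known neighbor: leave the word unchanged
    return word[:pos] + rng.choice(neighbors) + word[pos + 1:]


def insert_typo(word, pos, rng):
    """Insert a neighboring key just before the character at `pos`."""
    neighbors = row_neighbors(word[pos])
    if not neighbors:
        return word
    return word[:pos] + rng.choice(neighbors) + word[pos:]


def delete_typo(word, pos):
    """Drop the character at `pos`; single-character words are kept."""
    if len(word) < 2:
        return word
    return word[:pos] + word[pos + 1:]


def apply_typo(sentence, word_index, kind, seed=0):
    """Apply one typo of the given kind to one whitespace-separated word."""
    rng = random.Random(seed)
    words = sentence.split()
    word = words[word_index]
    pos = rng.randrange(len(word))
    if kind == "mistype":
        words[word_index] = mistype(word, pos, rng)
    elif kind == "insert":
        words[word_index] = insert_typo(word, pos, rng)
    elif kind == "delete":
        words[word_index] = delete_typo(word, pos)
    else:
        raise ValueError(f"unknown typo kind: {kind}")
    return " ".join(words)
```

For example, `apply_typo("the movie was great", 3, "delete")` removes one character from "great" and leaves the other words untouched. Pointing `word_index` at an informative word versus a function word is one way to probe the contrast described in finding (i).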

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Misinformation and Its Impacts

Methods: Linear Layer · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Multi-Head Attention