MockingBERT: A Method for Retroactively Adding Resilience to NLP Models
Jan Jezabek, Akash Singh

TL;DR
MockingBERT retroactively adds resilience to misspellings to transformer-based NLP models without re-training the original model and with only a minimal loss of accuracy on clean inputs. It also introduces an efficient approximate method for generating adversarial misspellings, which lowers the cost of evaluating robustness.
Contribution
The paper presents a novel approach for adding resilience to misspellings to existing NLP models without re-training them, and introduces an efficient approximate generator of adversarial misspellings for evaluating that resilience.
Findings
Resilience achieved with minimal accuracy loss on clean inputs.
Significant reduction in evaluation cost for adversarial robustness.
Method applicable to transformer-based NLP models.
Abstract
Protecting NLP models against misspellings, whether accidental or adversarial, has been the object of research interest for the past few years. Existing remediations have typically either compromised accuracy or required full model re-training with each new class of attacks. We propose a novel method of retroactively adding resilience to misspellings to transformer-based NLP models. This robustness can be achieved without re-training the original NLP model and with only a minimal loss of language understanding performance on inputs without misspellings. Additionally, we propose a new, efficient approximate method of generating adversarial misspellings, which significantly reduces the cost of evaluating a model's resilience to adversarial attacks.
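To make the kind of input in question concrete, the following is a minimal sketch of random character-level misspellings (swap, drop, insert, substitute). It is not the paper's adversarial generator, which the abstract does not describe; the function names `perturb_word` and `misspell`, the `rate` parameter, and the choice of edit operations are all assumptions made for illustration.

```python
import random
import string


def perturb_word(word, rng):
    """Apply one random character-level edit to a word (hypothetical helper)."""
    if len(word) < 2:
        return word  # too short to edit meaningfully
    op = rng.choice(["swap", "drop", "insert", "substitute"])
    i = rng.randrange(len(word) - 1)
    if op == "swap":
        # Transpose the characters at positions i and i + 1.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "drop":
        # Delete the character at position i.
        return word[:i] + word[i + 1:]
    letter = rng.choice(string.ascii_lowercase)
    if op == "insert":
        # Insert a random lowercase letter before position i.
        return word[:i] + letter + word[i:]
    # Substitute the character at position i with a random lowercase letter.
    return word[:i] + letter + word[i + 1:]


def misspell(sentence, rate=0.3, seed=0):
    """Perturb each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)
    words = sentence.split()
    return " ".join(
        perturb_word(w, rng) if rng.random() < rate else w for w in words
    )
```

A robustness check along these lines would compare a model's predictions on `sentence` and on `misspell(sentence)`. Random edits like these only approximate accidental typos; an adversarial generator would instead search for the edits that most damage the model's output.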
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling
