Fixing Model Bugs with Natural Language Patches
Shikhar Murty, Christopher D. Manning, Scott Lundberg, Marco Tulio Ribeiro

TL;DR
This paper introduces a method for fixing NLP model bugs using natural language patches, which are declarative corrections that improve model accuracy with minimal data and effort.
Contribution
It proposes modeling patch applicability separately from integration, demonstrating effective use of natural language patches to fix models with few examples.
Findings
Using 1 to 7 natural language patches improves accuracy by roughly 1-4 points on different slices of a sentiment analysis dataset.
F1 score increases by 7 points on relation extraction.
Small synthetic datasets suffice to teach models to use real patches effectively.
Abstract
Current approaches for fixing systematic problems in NLP models (e.g. regex patches, finetuning on more data) are either brittle, or labor-intensive and liable to shortcuts. In contrast, humans often provide corrections to each other through natural language. Taking inspiration from this, we explore natural language patches -- declarative statements that allow developers to provide corrective feedback at the right level of abstraction, either overriding the model ("if a review gives 2 stars, the sentiment is negative") or providing additional information the model may lack ("if something is described as the bomb, then it is good"). We model the task of determining if a patch applies separately from the task of integrating patch information, and show that with a small amount of synthetic data, we can teach models to effectively use real patches on real data -- 1 to 7 patches improve…
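The separation between deciding whether a patch applies and integrating what it says can be illustrated with a small sketch. This is only a toy under stated assumptions, not the authors' implementation: `Patch`, `base_model`, and `apply_patch` are hypothetical names, the applicability score is a keyword check standing in for a learned model, and only the overriding kind of patch is covered.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Probs = Dict[str, float]


@dataclass
class Patch:
    # Hypothetical stand-in for a learned applicability model:
    # returns a score in [0, 1] saying how strongly the patch applies.
    condition: Callable[[str], float]
    # Label distribution the patch asserts when it applies.
    consequence: Probs


def base_model(text: str) -> Probs:
    # Toy stand-in for the unpatched classifier (assumption: fixed output).
    return {"positive": 0.7, "negative": 0.3}


def apply_patch(text: str, patch: Patch,
                base: Callable[[str], Probs] = base_model) -> Probs:
    # Step 1: decide whether the patch applies (applicability score g).
    g = patch.condition(text)
    # Step 2: integrate by interpolating between the patch's
    # consequence and the base prediction, weighted by g.
    p = base(text)
    return {label: g * patch.consequence.get(label, 0.0) + (1.0 - g) * p[label]
            for label in p}


# "If a review gives 2 stars, the sentiment is negative."
two_star = Patch(
    condition=lambda t: 1.0 if "2 stars" in t.lower() else 0.0,
    consequence={"positive": 0.0, "negative": 1.0},
)

patched = apply_patch("I gave it 2 stars.", two_star)
untouched = apply_patch("Great food.", two_star)
```

When the condition does not fire, the base prediction passes through unchanged, so a patch only affects the inputs it is meant to cover.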
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
