Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel S. Weld

TL;DR
Polyjuice is a general-purpose, GPT-2-based counterfactual generator that produces diverse, realistic examples for NLP model training, evaluation, explanation, and error analysis, with less manual effort.
Contribution
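The paper introduces a controllable, general-purpose counterfactual generator, trained by finetuning GPT-2 on multiple datasets of paired sentences, that goes beyond manual creation and the limited perturbation types (such as paraphrases or word substitutions) of earlier methods.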
Findings
Improves training and evaluation on three tasks with around 70% less annotation effort than manual generation.
Augments state-of-the-art explanation techniques with generated counterfactuals.
Supports systematic counterfactual error analysis, revealing behaviors easily missed by human experts.
Abstract
While counterfactual examples are useful for analysis and training of NLP models, current generation methods either rely on manual labor to create very few counterfactuals, or only instantiate limited types of perturbations such as paraphrases or word substitutions. We present Polyjuice, a general-purpose counterfactual generator that allows for control over perturbation types and locations, trained by finetuning GPT-2 on multiple datasets of paired sentences. We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing behaviors easily missed by human experts.
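To make the idea of controlling perturbation type and location concrete, the sketch below shows one way a sentence pair could be serialized into a single string for finetuning a left-to-right language model such as GPT-2. The special tokens (`<|perturb|>`, `[BLANK]`, `[SEP]`, `[ANSWER]`), the control code name `negation`, and the helper `build_training_text` are illustrative assumptions, not details stated in the abstract.

```python
from typing import List, Tuple

# Hypothetical special tokens, chosen for illustration only.
PERTURB = "<|perturb|>"
BLANK = "[BLANK]"
SEP = "[SEP]"
ANSWER = "[ANSWER]"


def build_training_text(
    original: str,
    counterfactual_tokens: List[str],
    control_code: str,
    blank_spans: List[Tuple[int, int]],
) -> str:
    """Serialize an (original, counterfactual) pair into one training string.

    blank_spans holds sorted, non-overlapping half-open [start, end) token
    ranges in the counterfactual. Each range is replaced by a blank in the
    template (controlling *where* the change happens), and its tokens become
    an infill the model must generate. The control code states *what kind*
    of change is requested.
    """
    template: List[str] = []
    answers: List[str] = []
    cursor = 0
    for start, end in blank_spans:
        template.extend(counterfactual_tokens[cursor:start])
        template.append(BLANK)
        answers.append(" ".join(counterfactual_tokens[start:end]))
        cursor = end
    template.extend(counterfactual_tokens[cursor:])

    fills = " ".join(f"{answer} {ANSWER}" for answer in answers)
    return f"{original} {PERTURB} [{control_code}] {' '.join(template)} {SEP} {fills}"


text = build_training_text(
    original="It is great for kids .",
    counterfactual_tokens=["It", "is", "not", "great", "for", "children", "."],
    control_code="negation",
    blank_spans=[(2, 3), (5, 6)],
)
print(text)
```

At generation time, under the same assumptions, the prompt would stop after the control code (or after a user-supplied blanked template), and the model would complete the rest.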
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
Methods: Linear Layer · Cosine Annealing · Weight Decay · Discriminative Fine-Tuning · Softmax · Dropout · Byte Pair Encoding · Dense Connections · Linear Warmup With Cosine Annealing · Multi-Head Attention
