Tailor: Generating and Perturbing Text with Semantic Controls

Alexis Ross; Tongshuang Wu; Hao Peng; Matthew E. Peters; Matt Gardner

arXiv:2107.07150 · cs.CL · March 21, 2022

TL;DR

Tailor is a semantically-controlled text generation system that creates targeted perturbations to evaluate and improve NLP model robustness, using a pretrained seq2seq model and controllable semantic operations.

Contribution

Tailor introduces a flexible perturbation method, driven by semantic controls, that supports contrast set creation and data augmentation for NLP tasks.

Findings

01. Automatically generated high-quality contrast sets with fewer artifacts for four distinct NLP tasks.

02. Perturbing 2% of training data improved NLI model generalization.

03. The perturbations proved effective in multiple NLP applications.

Abstract

Controlled text perturbation is useful for evaluating and improving model generalizability. However, current techniques rely on training a model for every target perturbation, which is expensive and hard to generalize. We present Tailor, a semantically-controlled text generation system. Tailor builds on a pretrained seq2seq model and produces textual outputs conditioned on control codes derived from semantic representations. We craft a set of operations to modify the control codes, which in turn steer generation towards targeted attributes. These operations can be further composed into higher-level ones, allowing for flexible perturbation strategies. We demonstrate the effectiveness of these perturbations in multiple applications. First, we use Tailor to automatically create high-quality contrast sets for four distinct natural language processing (NLP) tasks. These contrast sets contain…
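The idea of editing control codes and composing those edits can be illustrated with a small sketch. This is not Tailor's actual API: the `ControlCode` fields, the operation names (`change_voice`, `swap_roles`, `compose`), the role labels, the example sentence content, and the bracketed prompt layout are all hypothetical, chosen only to show how simple operations on a semantic representation might chain into a higher-level perturbation before the result is handed to a seq2seq generator.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable, Tuple


@dataclass(frozen=True)
class ControlCode:
    """Hypothetical control code: a predicate plus role-labeled arguments."""
    verb: str
    voice: str                          # e.g. "active" or "passive"
    tense: str                          # e.g. "past" or "present"
    args: Tuple[Tuple[str, str], ...]   # (role, content) pairs


# An operation maps one control code to a new one (assumed design, not Tailor's).
Operation = Callable[[ControlCode], ControlCode]


def change_voice(voice: str) -> Operation:
    """Return an operation that sets the voice attribute."""
    return lambda code: replace(code, voice=voice)


def swap_roles(role_a: str, role_b: str) -> Operation:
    """Return an operation that exchanges two role labels on the arguments."""
    mapping = {role_a: role_b, role_b: role_a}

    def op(code: ControlCode) -> ControlCode:
        new_args = tuple((mapping.get(role, role), text) for role, text in code.args)
        return replace(code, args=new_args)

    return op


def compose(*ops: Operation) -> Operation:
    """Chain operations left to right into one higher-level operation."""
    return lambda code: reduce(lambda acc, op: op(acc), ops, code)


def render_prompt(code: ControlCode) -> str:
    """Serialize a control code into a made-up prompt string for a generator."""
    roles = " | ".join(f"{role}: {text}" for role, text in code.args)
    return f"[VERB: {code.verb} | VOICE: {code.voice} | TENSE: {code.tense} | {roles}]"


original = ControlCode(
    verb="comfort",
    voice="active",
    tense="past",
    args=(("AGENT", "the doctor"), ("PATIENT", "the athlete")),
)

# A higher-level perturbation built from two primitive operations.
passivize_and_swap = compose(change_voice("passive"), swap_roles("AGENT", "PATIENT"))
perturbed = passivize_and_swap(original)
prompt = render_prompt(perturbed)
```

In a real pipeline the rendered prompt would be fed to the pretrained seq2seq model, which generates text matching the modified controls; that generation step is omitted here because it depends on the released model and its input format.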

Code & Models

Repositories

allenai/tailor (Official)

Models

allenai/tailor (Hugging Face model; 17 downloads, 2 likes)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence