Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Sumanth Dathathri; Andrea Madotto; Janice Lan; Jane Hung; Eric Frank; Piero Molino; Jason Yosinski; Rosanne Liu

arXiv:1912.02164·cs.CL·March 4, 2020·130 cites

Open Access · 5 Repos

TL;DR

The paper introduces Plug and Play Language Models (PPLM), a simple method for controlling text generation attributes by guiding a pretrained language model with lightweight attribute classifiers without retraining.

Contribution

It presents an approach that steers a pretrained language model with attribute classifiers, without any further training of the LM itself, enabling flexible and controlled text generation.

Findings

01. Effective control over topics and sentiment demonstrated.

02. High attribute alignment and fluency in generated text.

03. Flexible combination of attribute models for diverse applications.

Abstract

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the…
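
To make the mechanism in the abstract more concrete, here is a minimal sketch of a bag-of-words attribute model steering a language model through gradients. It is a toy stand-in, not the authors' implementation: the "LM" is a single hypothetical linear head `W` over a two-dimensional hidden state, the vocabulary and weights are invented for illustration, and the normalized gradient-ascent update in `steer_hidden` (with its `step_size` and `num_steps` parameters) is an assumption about what a forward and backward pass could look like, since the abstract shown here is truncated before it describes the update.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from_hidden(W, h):
    # Toy "LM head" (assumption): one logit per vocabulary item, logits = W @ h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def bow_log_prob(probs, bag):
    # Bag-of-words attribute score: log of the total next-token
    # probability assigned to the words in the bag.
    return math.log(sum(probs[i] for i in bag))

def bow_grad_hidden(W, h, bag):
    # Backward pass: gradient of the attribute score w.r.t. the hidden state.
    probs = softmax(logits_from_hidden(W, h))
    mass = sum(probs[i] for i in bag)
    # d log(mass) / d logit_j = 1[j in bag] * p_j / mass - p_j
    grad_logits = [(p / mass if j in bag else 0.0) - p
                   for j, p in enumerate(probs)]
    # Chain rule through logits = W @ h.
    return [sum(W[j][k] * grad_logits[j] for j in range(len(W)))
            for k in range(len(h))]

def steer_hidden(W, h, bag, step_size=0.5, num_steps=5):
    # Hypothetical update rule: a few normalized gradient-ascent steps on
    # the hidden state; the LM weights W are never modified.
    h = list(h)
    for _ in range(num_steps):
        grad = bow_grad_hidden(W, h, bag)
        norm = math.sqrt(sum(g * g for g in grad)) or 1.0
        h = [x + step_size * g / norm for x, g in zip(h, grad)]
    return h

# Invented toy vocabulary and weights, for illustration only.
VOCAB = ["the", "ocean", "boat", "tax", "vote"]
W = [[0.5, 0.1], [0.2, 0.9], [0.1, 0.8], [0.7, -0.3], [0.6, -0.4]]
bag = {VOCAB.index("ocean"), VOCAB.index("boat")}  # user-specified topic words

h0 = [1.0, 0.0]
before = bow_log_prob(softmax(logits_from_hidden(W, h0)), bag)
h1 = steer_hidden(W, h0, bag)
after = bow_log_prob(softmax(logits_from_hidden(W, h1)), bag)
print(f"log p(bag) before: {before:.3f}, after: {after:.3f}")
```

The point of the sketch is that only the hidden state moves while `W` stays fixed, which mirrors the "without any further training of the LM" claim. A real system would apply the same idea to the activations of a pretrained transformer and use automatic differentiation rather than a hand-derived gradient.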

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications