GeDi: Generative Discriminator Guided Sequence Generation

Ben Krause; Akhilesh Deepak Gotmare; Bryan McCann; Nitish Shirish Keskar; Shafiq Joty; Richard Socher; Nazneen Fatema Rajani

arXiv:2009.06367 · cs.CL · October 23, 2020 · 6 citations

TL;DR

GeDi uses smaller language models as generative discriminators to guide generation from large language models, making the output safer and more controllable. It generates faster than the prior state-of-the-art control method and supports zero-shot topic control.

Contribution

Introduces GeDi, an approach that guides generation from large language models using smaller language models as generative discriminators, improving safety and controllability and enabling zero-shot topic control.

Findings

01. GeDi gives stronger controllability than the state-of-the-art method it is compared against.

02. Generation with GeDi is more than 30 times faster than that method.

03. GeDi reduces the toxicity of GPT-2 generations without sacrificing linguistic quality.

Abstract

While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. We propose GeDi as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs to make them safer and more controllable. GeDi guides generation at each step by computing classification probabilities for all possible next tokens via Bayes rule by normalizing over two class-conditional distributions; one conditioned on the desired attribute, or control code, and another conditioned on the undesired attribute, or anti control code. We find that GeDi gives stronger controllability than the state…
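The Bayes-rule step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes we already have, for each candidate next token, a log-probability under the small LM conditioned on the control code and another under the same LM conditioned on the anti control code. The function names `class_posteriors` and `reweight`, the equal class prior, and the exponent `omega` are hypothetical choices made for this example.

```python
import math

def class_posteriors(logp_desired, logp_undesired, prior_desired=0.5):
    """Return P(desired attribute | candidate token) for every candidate.

    logp_desired[i]   : log-probability of the text extended by candidate i,
                        under the LM conditioned on the control code
    logp_undesired[i] : the same quantity under the anti control code
    prior_desired     : assumed prior probability of the desired class
    """
    log_prior_d = math.log(prior_desired)
    log_prior_u = math.log(1.0 - prior_desired)
    posteriors = []
    for lp_d, lp_u in zip(logp_desired, logp_undesired):
        a = log_prior_d + lp_d  # log joint for the desired class
        b = log_prior_u + lp_u  # log joint for the undesired class
        m = max(a, b)
        # Bayes rule: normalize over the two classes (log-sum-exp for stability).
        log_norm = m + math.log(math.exp(a - m) + math.exp(b - m))
        posteriors.append(math.exp(a - log_norm))
    return posteriors

def reweight(base_probs, posteriors, omega=1.0):
    """Illustrative guidance step (an assumption of this sketch): scale the
    large LM's next-token probabilities by the posterior raised to omega,
    then renormalize so they sum to one."""
    weighted = [p * (q ** omega) for p, q in zip(base_probs, posteriors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Toy vocabulary of three candidate tokens.
logp_d = [math.log(0.6), math.log(0.3), math.log(0.1)]
logp_u = [math.log(0.1), math.log(0.3), math.log(0.6)]
post = class_posteriors(logp_d, logp_u)
guided = reweight([0.2, 0.3, 0.5], post)
```

In this toy case the first token is favored by the desired-class model, so its share of the guided distribution rises even though the base LM ranked it lowest. Both conditional scores for every candidate come from the small LM, so the large LM is only consulted for its ordinary next-token distribution.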

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection

Methods: Linear Layer · Cosine Annealing · Layer Normalization · Weight Decay · Dropout · Dense Connections · Linear Warmup With Cosine Annealing · Attention Dropout · Byte Pair Encoding · Multi-Head Attention