Decoding Methods for Neural Narrative Generation

Alexandra DeLucia; Aaron Mueller; Xiang Lisa Li; João Sedoc

arXiv:2010.07375·cs.CL·July 9, 2021

TL;DR

This paper explores how decoding techniques from neural response generation can be adapted to improve neural narrative generation, highlighting the effectiveness of nucleus sampling and mutual information objectives.

Contribution

It applies and evaluates recent decoding methods from response generation to narrative generation, providing insights into their effectiveness and limitations.

Findings

1. Nucleus sampling generally performs best with thresholds between 0.7 and 0.9.

2. A maximum mutual information objective can improve story quality.

3. Automatic metrics correlate poorly with human judgments.
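
To make the nucleus sampling threshold concrete, here is a minimal sketch of top-p filtering over a single next-token distribution. It is not the authors' code: the function names `nucleus_filter` and `nucleus_sample` and the toy probabilities are assumptions chosen for illustration, and a real setup would apply this to GPT-2's softmax output at each decoding step.

```python
import random


def nucleus_filter(probs, p):
    """Keep the smallest set of most-probable tokens whose cumulative
    probability reaches p, then renormalize over that set.

    probs: list of next-token probabilities indexed by token id (toy input).
    p: nucleus threshold in (0, 1].
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token ids from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token_id, prob in ranked:
        kept.append((token_id, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token_id: prob / total for token_id, prob in kept}


def nucleus_sample(probs, p, rng=random):
    """Sample one token id from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Hypothetical 4-token vocabulary, for illustration only.
toy_probs = [0.5, 0.3, 0.15, 0.05]
low = nucleus_filter(toy_probs, 0.7)   # keeps the two most likely tokens
high = nucleus_filter(toy_probs, 0.9)  # keeps the three most likely tokens
```

A lower threshold truncates more of the tail, which trades diversity for safety; a higher threshold lets less likely tokens through.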

Abstract

Narrative generation is an open-ended NLP task in which a model generates a story given a prompt. The task is similar to neural response generation for chatbots; however, innovations in response generation are often not applied to narrative generation, despite the similarity between these tasks. We aim to bridge this gap by applying and evaluating advances in decoding methods for neural response generation to neural narrative generation. In particular, we employ GPT-2 and perform ablations across nucleus sampling thresholds and diverse decoding hyperparameters -- specifically, maximum mutual information -- analyzing results over multiple criteria with automatic and human evaluation. We find that (1) nucleus sampling is generally best with thresholds between 0.7 and 0.9; (2) a maximum mutual information objective can improve the quality of generated stories; and (3) established automatic…
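
The abstract does not spell out the maximum mutual information objective. One common form from response generation penalizes candidates that are likely regardless of the prompt, scoring each candidate story y for prompt x as log p(y|x) minus λ·log p(y). The sketch below reranks candidates under that assumed form; the function name `mmi_rerank`, the weight `lam`, and the log-probabilities are all hypothetical, and in practice the two terms would come from a conditional and an unconditional language model.

```python
def mmi_rerank(candidates, lam=0.5):
    """Rerank candidates by an assumed MMI-style score.

    candidates: list of (text, log_p_cond, log_p_uncond) tuples, where
        log_p_cond   ~ log p(story | prompt)
        log_p_uncond ~ log p(story)
    lam: hypothetical weight on the unconditional penalty.
    Returns (text, score) pairs sorted from best to worst.
    """
    scored = [
        (text, log_p_cond - lam * log_p_uncond)
        for text, log_p_cond, log_p_uncond in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up log-probabilities, for illustration only.
toy_candidates = [
    ("generic story", -10.0, -5.0),    # likely under any prompt
    ("specific story", -11.0, -12.0),  # likely mainly under this prompt
]
ranked_mmi = mmi_rerank(toy_candidates, lam=0.5)
ranked_plain = mmi_rerank(toy_candidates, lam=0.0)
```

With `lam=0.0` the ranking reduces to plain conditional likelihood and favors the generic candidate; with a positive `lam` the prompt-specific candidate moves ahead.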

Taxonomy

Methods: Linear Layer · Cosine Annealing · Attention Is All You Need · Adam · Byte Pair Encoding · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Linear Warmup With Cosine Annealing