TL;DR
This paper explores how decoding techniques from neural response generation can be adapted to improve neural narrative generation, highlighting the effectiveness of nucleus sampling and mutual information objectives.
Contribution
It applies recent decoding methods from response generation to narrative generation and evaluates them there, providing insights into their effectiveness and limitations.
Findings
Nucleus sampling is generally best with thresholds between 0.7 and 0.9.
Maximum mutual information improves story quality.
Automatic metrics correlate poorly with human judgments.
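To make the first finding concrete, here is a minimal sketch of nucleus (top-p) sampling over a single next-token distribution. It is not the paper's code: the function names `nucleus_filter` and `sample_nucleus` and the toy probabilities are assumptions for illustration, and a real setup would apply this to GPT-2's softmax output at every decoding step.

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p, then renormalize.

    probs: list of token probabilities (assumed to sum to 1).
    Returns a dict mapping token index -> renormalized probability.
    """
    # Visit token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_nucleus(probs, top_p, rng=random):
    """Draw one token index from the truncated, renormalized distribution."""
    filtered = nucleus_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution over four hypothetical tokens.
toy_probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus_filter(toy_probs, 0.7))  # keeps the two most probable tokens
print(nucleus_filter(toy_probs, 0.9))  # keeps the three most probable tokens
```

A lower threshold restricts sampling to fewer, more probable tokens; a higher one lets more of the tail through. The 0.7 to 0.9 range reported above sits between those extremes.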
Abstract
Narrative generation is an open-ended NLP task in which a model generates a story given a prompt. The task is similar to neural response generation for chatbots; however, innovations in response generation are often not applied to narrative generation, despite the similarity between these tasks. We aim to bridge this gap by applying and evaluating advances in decoding methods for neural response generation to neural narrative generation. In particular, we employ GPT-2 and perform ablations across nucleus sampling thresholds and diverse decoding hyperparameters -- specifically, maximum mutual information -- analyzing results over multiple criteria with automatic and human evaluation. We find that (1) nucleus sampling is generally best with thresholds between 0.7 and 0.9; (2) a maximum mutual information objective can improve the quality of generated stories; and (3) established automatic…
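The abstract does not spell out its maximum mutual information objective, so the following is only a sketch of one common form from the response-generation literature: rerank candidates by log p(story | prompt) minus a weighted log p(story). Whether the paper uses exactly this form is an assumption, and the names `mmi_score`, `rerank_mmi`, and `lam`, as well as the log-probabilities in the usage example, are invented for illustration.

```python
def mmi_score(logp_conditional, logp_unconditional, lam):
    """Anti-LM style MMI score: log p(T|S) - lam * log p(T).

    A higher lam penalizes candidates that are likely regardless of
    the prompt, i.e. generic continuations.
    """
    return logp_conditional - lam * logp_unconditional


def rerank_mmi(candidates, lam=0.5):
    """Sort candidates from best to worst MMI score.

    candidates: list of (text, logp_conditional, logp_unconditional),
    where both log-probabilities are assumed to come from a language
    model scoring the same text with and without the prompt.
    """
    return sorted(
        candidates,
        key=lambda c: mmi_score(c[1], c[2], lam),
        reverse=True,
    )


# Made-up scores: the generic story is slightly more likely given the
# prompt, but also far more likely without it.
candidates = [
    ("generic story", -10.0, -8.0),
    ("prompt-specific story", -11.0, -20.0),
]
print(rerank_mmi(candidates, lam=0.5)[0][0])  # prompt-specific story
print(rerank_mmi(candidates, lam=0.0)[0][0])  # generic story
```

With `lam=0.0` the ranking reduces to plain conditional likelihood; raising `lam` shifts preference toward text that depends on the prompt.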
Taxonomy
Methods: Linear Layer · Cosine Annealing · Attention Is All You Need · Adam · Byte Pair Encoding · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Linear Warmup With Cosine Annealing
