Hierarchical Neural Story Generation

Angela Fan; Mike Lewis; Yann Dauphin

arXiv:1805.04833·cs.CL·May 15, 2018·152 cites

TL;DR

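This paper presents a hierarchical neural approach to story generation: the model first generates a premise and then expands it into a passage of text. Trained on a dataset of 300K human-written stories paired with writing prompts, it produces stories that are more relevant to the prompt and that human judges prefer over those from a strong non-hierarchical model.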

Contribution

It introduces a hierarchical generation framework, a new form of model fusion that improves the relevance of the story to the prompt, and a gated multi-scale self-attention mechanism for modeling long-range context.

Findings

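01 A dataset of 300K human-written stories paired with writing prompts enables hierarchical story generation.

02 The hierarchical approach shows large improvements over strong baselines on both automated and human evaluations.

03 Human judges prefer stories from the hierarchical approach over those from a strong non-hierarchical model by a factor of two to one.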

Abstract

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.
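The two-stage idea in the abstract (generate a premise first, then turn it into a passage) can be sketched as a simple pipeline. This is a minimal illustration only, not the authors' implementation: `generate_story`, `toy_premise_model`, and `toy_story_model` are hypothetical names, and the toy functions are plain string templates standing in for trained neural models.

```python
from typing import Callable, Dict

# Hypothetical type aliases: any callable mapping text to text can play either role.
PremiseModel = Callable[[str], str]
StoryModel = Callable[[str], str]


def generate_story(topic: str,
                   premise_model: PremiseModel,
                   story_model: StoryModel) -> Dict[str, str]:
    """Two-stage generation: topic -> premise -> passage."""
    premise = premise_model(topic)   # stage 1: produce a short premise
    story = story_model(premise)     # stage 2: condition the passage on that premise
    return {"premise": premise, "story": story}


# Toy placeholders so the sketch runs end to end (assumptions, not the paper's models).
def toy_premise_model(topic: str) -> str:
    return f"A stranger arrives with news about {topic}."


def toy_story_model(premise: str) -> str:
    return f"{premise} Nobody in the village believed it at first."


result = generate_story("a lost map", toy_premise_model, toy_story_model)
print(result["premise"])
print(result["story"])
```

The point of the split is that the second stage only ever sees the premise, so the premise acts as an explicit plan that the passage is conditioned on. In a real system each callable would wrap a trained sequence model.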

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications