Future Sight: Dynamic Story Generation with Large Pretrained Language Models
Brian D. Zimmerman, Gaurav Sahu, Olga Vechtomova

TL;DR
Future Sight introduces a method for fine-tuning pretrained transformers to condition story generation on future plot points, enabling more controllable and coherent narratives guided by human input.
Contribution
The paper presents a novel fine-tuning approach allowing transformers to attend to future plot events, enhancing controllability in story generation tasks.
Findings
Improves coherence when conditioning on future plot points
Enables human-guided narrative steering during generation
Achieves positive evaluations from human judges
Abstract
Recent advances in deep learning research, such as transformers, have bolstered the ability for automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a self-attention mechanism to emulate the property of autoregression. This is inherently limiting for tasks such as controllable story generation where it may be necessary to condition on future plot events when writing a story. In this work, we propose Future Sight, a method for finetuning a pretrained generative transformer on the task of future conditioning. Transformer decoders are typically pretrained on the task of completing a context, one token at a time, by means of self-attention. Future…
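To make the default behaviour described in the abstract concrete, the following is a minimal sketch of causal (autoregressive) self-attention masking in plain Python. It illustrates only the standard restriction that each position attends to itself and earlier positions; it is not the Future Sight method, which the truncated abstract does not fully specify. The function name `causal_attention_weights` and the example scores are hypothetical, chosen for illustration.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with future positions masked.

    scores[i][j] is the raw attention score of position i toward position j.
    Illustrative only: a real decoder computes these scores from learned
    query/key projections and does this per attention head.
    """
    weights = []
    for i, row in enumerate(scores):
        # Position i may only look at positions 0..i (itself and the past).
        visible = row[: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Future positions (j > i) receive a weight of exactly 0.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


# Hypothetical scores for a three-token sequence.
scores = [
    [0.5, 1.0, -0.3],
    [0.2, 0.1, 0.9],
    [1.5, -0.7, 0.4],
]
weights = causal_attention_weights(scores)
```

Each row of `weights` sums to 1, and every entry above the diagonal is 0, so the first token attends only to itself. This is the constraint the abstract calls limiting when a story needs to be conditioned on a plot event that has not been written yet.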
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Artificial Intelligence in Games
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Dropout
