Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models

Alexandre Duval, Thomas Lamson, Gael de Leseleuc de Kerouara and Matthias Gallé

arXiv:2101.03216·cs.CL·March 3, 2021

TL;DR

This paper presents a low-cost fine-tuning approach for natural language generation models that helps writers overcome writer's block by conditioning generation on the surrounding context, a list of entities, and metadata such as size, genre, and a summary.

Contribution

It introduces a novel, low-cost fine-tuning method for generation models that performs well after only a small number of epochs and supports controlled text generation for writers.

Findings

01

Obtains excellent results after only a small number of training epochs.

02

Total cost of fine-tuning is USD 150.

03

The system is accessible as a web service, and all the code is released.

Abstract

It is standard procedure these days to solve Information Extraction tasks by fine-tuning large pre-trained language models. This is not the case for generation tasks, which rely on a variety of techniques for controlled language generation. In this paper, we describe a system that fine-tunes a natural language generation model for the problem of solving Writer's Block. The fine-tuning changes the conditioning to also include the right context in addition to the left context, as well as an optional list of entities, the size, the genre and a summary of the paragraph that the human author wishes to generate. Our proposed fine-tuning obtains excellent results, even with a small number of epochs and a total cost of USD 150. The system can be accessed as a web-service, and all the code is released. A video showcasing the interface and the model is also available.
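
The conditioning described in the abstract can be pictured as a single input sequence that packs the optional controls and both contexts ahead of the paragraph to be generated. The following is only a minimal sketch of that idea, not the authors' implementation: the marker strings (`<genre>`, `<size>`, `<entities>`, `<summary>`, `<right>`, `<left>`, `<target>`), the field order, and the function name `build_conditioning` are all hypothetical, chosen purely for illustration.

```python
from typing import Optional, Sequence

# Hypothetical control markers (assumption): the special tokens actually
# used by the paper's system may be named and ordered differently.
GENRE, SIZE, ENTITIES, SUMMARY = "<genre>", "<size>", "<entities>", "<summary>"
RIGHT, LEFT, TARGET = "<right>", "<left>", "<target>"


def build_conditioning(
    left: str,
    right: str,
    entities: Sequence[str] = (),
    size: Optional[str] = None,
    genre: Optional[str] = None,
    summary: Optional[str] = None,
) -> str:
    """Pack optional controls and both contexts into one prompt string.

    Optional fields are skipped when not provided, so the same format
    covers every combination of controls the writer chooses to give.
    """
    parts = []
    if genre:
        parts.append(f"{GENRE} {genre}")
    if size:
        parts.append(f"{SIZE} {size}")
    if entities:
        parts.append(f"{ENTITIES} " + ", ".join(entities))
    if summary:
        parts.append(f"{SUMMARY} {summary}")
    # The right context comes before the left context so that the text
    # immediately preceding the generated paragraph sits closest to it.
    parts.append(f"{RIGHT} {right.strip()}")
    parts.append(f"{LEFT} {left.strip()}")
    # A left-to-right model would continue the sequence after this marker.
    parts.append(TARGET)
    return "\n".join(parts)
```

Under these assumptions, calling `build_conditioning("She opened the door.", "The room was empty.", entities=["Ann", "Bob"], genre="mystery")` yields a prompt that ends with the `<target>` marker; fine-tuning examples would append the human-written paragraph after it, and at inference time the model would generate that paragraph.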

Taxonomy

Methods: Linear Layer · Cosine Annealing · Weight Decay · Discriminative Fine-Tuning · Linear Warmup With Cosine Annealing · Dense Connections · Byte Pair Encoding · Multi-Head Attention · Attention Is All You Need · Dropout