Uniform Complexity for Text Generation

Joseph Marvin Imperial; Harish Tayyar Madabushi

arXiv:2204.05185·cs.CL·October 23, 2023

TL;DR

This paper introduces UCTG, a benchmark for evaluating and improving how consistently generative NLP models keep text complexity uniform, and shows that current models struggle to preserve the complexity of the input prompt.

Contribution

The paper presents a new benchmark, UCTG, and evaluates over 150 features to assess and enhance the uniformity of linguistic complexity in text generation.

Findings

01. Models like GPT-2 struggle to preserve input complexity.

02. Fine-tuning on professional texts does not fully solve complexity preservation.

03. The benchmark enables systematic evaluation of complexity consistency.

Abstract

Large language models (LLMs) have shown promising results in a wide array of generative NLP tasks, such as summarization and machine translation. In the context of narrative generation, however, existing models still do not capture factors that contribute to producing consistent text. For instance, it is logical that a piece of text or a story should be uniformly readable throughout and that this form of complexity should be controllable. As such, if the complexity of an input text prompt is rated first-grade reading level in the Flesch Reading Ease test, then the generated text continuing the plot should also be within this range of complexity. With this in mind, we introduce Uniform Complexity for Text Generation (UCTG), a new benchmark test which raises the challenge of making generative models observe uniform linguistic properties with respect to prompts. We experiment with over…
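The Flesch Reading Ease comparison described in the abstract can be illustrated with a minimal sketch. It uses the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words), and reports the absolute score gap between a prompt and its continuation. This is not the authors' implementation: the function names `count_syllables`, `flesch_reading_ease`, and `complexity_gap` are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here only to keep the example dependency-free.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption, not a dictionary lookup):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def complexity_gap(prompt: str, continuation: str) -> float:
    """Absolute difference in reading ease between prompt and continuation.
    A gap near zero suggests the continuation stays at the prompt's level."""
    return abs(flesch_reading_ease(prompt) - flesch_reading_ease(continuation))


if __name__ == "__main__":
    prompt = "The cat sat on the mat. It was a sunny day."
    continuation = "The dog ran to the park. He saw a big red ball."
    print(f"prompt score:       {flesch_reading_ease(prompt):.2f}")
    print(f"continuation score: {flesch_reading_ease(continuation):.2f}")
    print(f"gap:                {complexity_gap(prompt, continuation):.2f}")
```

A single readability formula is only one possible signal of complexity. The paper's summary above mentions evaluating over 150 features, so a gap on this one score should be read as an illustration of the idea rather than as the benchmark's metric.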

Code & Models

Repositories

imperialite/uniform-complexity-textgen (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Games

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Byte Pair Encoding · Dense Connections · Attention Dropout · Linear Warmup With Cosine Annealing · Weight Decay