Uniform Complexity for Text Generation
Joseph Marvin Imperial, Harish Tayyar Madabushi

TL;DR
This paper introduces UCTG, a benchmark for evaluating and improving how consistently generative NLP models maintain text complexity, and shows that current models struggle to keep their output at the complexity level of the input prompt.
Contribution
The paper presents a new benchmark, UCTG, and evaluates over 150 features to assess and enhance the uniformity of linguistic complexity in text generation.
Findings
Models like GPT-2 struggle to preserve input complexity.
Finetuning on professional texts does not fully solve the problem of complexity preservation.
The benchmark enables systematic evaluation of complexity consistency.
Abstract
Large language models (LLMs) have shown promising results in a wide array of generative NLP tasks, such as summarization and machine translation. In the context of narrative generation, however, existing models still do not capture factors that contribute to producing consistent text. For instance, it is logical that a piece of text or a story should be uniformly readable throughout and that this form of complexity should be controllable. As such, if the complexity of an input text prompt is rated first-grade reading level in the Flesch Reading Ease test, then the generated text continuing the plot should also be within this range of complexity. With this in mind, we introduce Uniform Complexity for Text Generation (UCTG), a new benchmark test which raises the challenge of making generative models observe uniform linguistic properties with respect to prompts. We experiment with over…
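To make the abstract's example concrete, the sketch below scores a prompt and a continuation with the standard Flesch Reading Ease formula and reports the gap between them. It is only an illustration, not the paper's evaluation code: the function names (`count_syllables`, `flesch_reading_ease`, `complexity_gap`) are hypothetical, the syllable counter is a rough vowel-group heuristic, and the benchmark itself draws on a much larger feature set than this single score.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def complexity_gap(prompt: str, continuation: str) -> float:
    """Absolute difference in reading ease between a prompt and its continuation."""
    return abs(flesch_reading_ease(prompt) - flesch_reading_ease(continuation))


if __name__ == "__main__":
    prompt = "The cat sat. The dog ran."
    continuation = (
        "Institutional considerations necessitate comprehensive "
        "organizational restructuring."
    )
    print(f"prompt score:       {flesch_reading_ease(prompt):.2f}")
    print(f"continuation score: {flesch_reading_ease(continuation):.2f}")
    print(f"gap:                {complexity_gap(prompt, continuation):.2f}")
```

Higher scores mean easier text, so a large gap suggests the continuation has drifted away from the prompt's reading level. A generation that respects uniform complexity would keep this gap small.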
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Games
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Byte Pair Encoding · Dense Connections · Attention Dropout · Linear Warmup With Cosine Annealing · Weight Decay
