Automating Text Naturalness Evaluation of NLG Systems

Erion Çano; Ondřej Bojar

arXiv:2006.13268·cs.CL·June 25, 2020

Open Access

TL;DR

This paper proposes an automated method for evaluating the naturalness of generated text using pretrained language models, aiming to replace human judgments with a scalable, consistent metric.

Contribution

It introduces a novel automatic evaluation approach for text naturalness based on language model probabilities and analyzes the impact of model size on evaluation quality.

Findings

01 Larger models improve naturalness evaluation accuracy

02 Model size influences the reliability of the human likeliness metric

03 Further validation with human judgments is needed

Abstract

Automatic methods and metrics that assess various quality criteria of automatically generated texts are important for developing NLG systems because they produce repeatable results and allow for a fast development cycle. We present here an attempt to automate the evaluation of text naturalness, which is a very important characteristic of natural language generation methods. Instead of relying on human participants to score or label the text samples, we propose to automate the process by using a human likeliness metric that we define and a discrimination procedure based on large pretrained language models and their probability distributions. We analyze the text probability fractions and observe how they are influenced by the size of the generative and discriminative models involved in the process. Based on our results, bigger generators and larger pretrained discriminators are more…
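The abstract does not state how the human likeliness metric is defined, so the following is only an illustrative sketch of the general idea of scoring text with language model probabilities and reporting the fraction of samples that pass as human-like. Everything in it is an assumption made for illustration: a smoothed unigram model stands in for a large pretrained discriminator, the threshold rule (the lowest score among the human references) is hypothetical, and the names `train_unigram`, `mean_log_prob`, and `human_like_fraction` are invented here rather than taken from the paper.

```python
import math
from collections import Counter


def train_unigram(corpus, alpha=1.0):
    """Fit an add-alpha smoothed unigram model (a toy stand-in for a pretrained LM)."""
    counts = Counter(tok for text in corpus for tok in text.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen tokens

    def log_prob(token):
        # Smoothed log-probability; unseen tokens get a small non-zero mass.
        return math.log((counts.get(token, 0) + alpha) / (total + alpha * vocab))

    return log_prob


def mean_log_prob(text, log_prob):
    """Average per-token log-probability, so that texts of different length compare."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    return sum(log_prob(tok) for tok in tokens) / len(tokens)


def human_like_fraction(samples, human_refs, log_prob):
    """Fraction of samples scoring at least as high as the weakest human reference.

    Hypothetical decision rule, not the metric defined in the paper.
    """
    threshold = min(mean_log_prob(ref, log_prob) for ref in human_refs)
    hits = sum(mean_log_prob(s, log_prob) >= threshold for s in samples)
    return hits / len(samples)


if __name__ == "__main__":
    corpus = ["the cat sat on the mat", "the dog sat on the rug"]
    scorer = train_unigram(corpus)
    generated = ["the dog sat on the mat", "zzz qqq xxx"]
    print(human_like_fraction(generated, ["the cat sat on the rug"], scorer))
```

In a realistic setting the unigram scorer would be replaced by token probabilities from a pretrained language model, and the threshold would be calibrated on held-out human-written text.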

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification