CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation

Pei Ke; Hao Zhou; Yankai Lin; Peng Li; Jie Zhou; Xiaoyan Zhu; Minlie Huang

arXiv:2204.00862·cs.CL·December 6, 2022

TL;DR

CTRLEval is an unsupervised, reference-free metric that evaluates controlled text generation by leveraging multiple text infilling tasks and pre-trained language models, showing superior correlation with human judgments and better generalization.

Contribution

The paper introduces an unsupervised, reference-free evaluation metric for controlled text generation that requires no model training and outperforms existing metrics both in correlation with human judgments and in generalization.

Findings

1. Higher correlation with human judgments than baselines

2. Better generalization across different models and qualities

3. No model training required for evaluation

Abstract

Existing reference-free metrics have obvious limitations for evaluating controlled text generation models. Unsupervised metrics can only provide a task-agnostic evaluation result which correlates weakly with human judgments, whereas supervised ones may overfit task-specific data with poor generalization ability to other datasets. In this paper, we propose an unsupervised reference-free metric called CTRLEval, which evaluates controlled text generation from different aspects by formulating each aspect into multiple text infilling tasks. On top of these tasks, the metric assembles the generation probabilities from a pre-trained language model without any model training. Experimental results show that our metric has higher correlations with human judgments than other baselines, while obtaining better generalization of evaluating generated texts from different models and with different…
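The idea of turning one evaluation aspect into several text infilling tasks and then assembling generation probabilities can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence-level masking scheme, the uniform weights, and every name below (`make_infilling_tasks`, `toy_log_prob`, `assemble_score`) are assumptions made for illustration, and `toy_log_prob` is a smoothed word-overlap stand-in for the log-probability that a real pre-trained language model would assign to the masked span.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

MASK = "<mask>"  # hypothetical mask token, not tied to any specific model


def make_infilling_tasks(sentences: Sequence[str]) -> List[Tuple[str, str]]:
    """Build one infilling task per sentence: (context with that sentence masked, target)."""
    tasks = []
    for i, target in enumerate(sentences):
        context = " ".join(MASK if j == i else s for j, s in enumerate(sentences))
        tasks.append((context, target))
    return tasks


def toy_log_prob(context: str, target: str) -> float:
    """Stand-in scorer (assumption): add-one smoothed unigram overlap with the context.

    A real setup would return the log-probability of `target` filling the mask
    under a pre-trained language model.
    """
    ctx = [w for w in context.lower().split() if w != MASK]
    tgt = target.lower().split()
    vocab = set(ctx) | set(tgt)
    return sum(math.log((ctx.count(w) + 1) / (len(ctx) + len(vocab))) for w in tgt)


def assemble_score(
    sentences: Sequence[str],
    log_prob: Callable[[str, str], float] = toy_log_prob,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine the per-task log-probabilities into one score (uniform weights by default)."""
    tasks = make_infilling_tasks(sentences)
    if weights is None:
        weights = [1.0 / len(tasks)] * len(tasks)
    return sum(w * log_prob(c, t) for w, (c, t) in zip(weights, tasks))


related = assemble_score(["the cat sat", "the cat ran"])
unrelated = assemble_score(["the cat sat", "blue skies ahead"])
print(related > unrelated)
```

Under this toy scorer, a text whose sentences share vocabulary receives a higher assembled score than one whose sentences are unrelated. Swapping `log_prob` for a model-based scorer would keep the same task construction and aggregation steps.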

Code & Models

Repositories

thu-coai/ctrleval (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · AI in Service Interactions