GPTScore: Evaluate as You Desire

Jinlan Fu; See-Kiong Ng; Zhengbao Jiang; Pengfei Liu

arXiv:2302.04166 · cs.CL · February 14, 2023 · 81 citations

TL;DR

GPTScore introduces a flexible, instruction-based evaluation framework leveraging pre-trained models' emergent abilities to assess generated texts across multiple facets without requiring annotated data.

Contribution

The paper presents GPTScore, a novel evaluation method that uses pre-trained models' zero-shot capabilities for customizable, multi-dimensional text quality assessment.

Findings

01 Effective evaluation across four text generation tasks

02 Ability to customize evaluation criteria via natural language instructions

03 No need for annotated datasets for different evaluation aspects

Abstract

Generative Artificial Intelligence (AI) has enabled the development of sophisticated models that are capable of producing high-caliber text, images, and other outputs through the utilization of large pre-trained models. Nevertheless, assessing the quality of the generation is an even more arduous task than the generation itself, and this issue has not been given adequate consideration recently. This paper proposes a novel evaluation framework, GPTScore, which utilizes the emergent abilities (e.g., zero-shot instruction) of generative pre-trained models to score generated texts. There are 19 pre-trained models explored in this paper, ranging in size from 80M (e.g., FLAN-T5-small) to 175B (e.g., GPT3). Experimental results on four text generation tasks, 22 evaluation aspects, and corresponding 37 datasets demonstrate that this approach can effectively allow us to achieve what one desires…
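One way to read the idea of using a generative pre-trained model to score generated texts is as a conditional likelihood: the model is shown an instruction naming the desired aspect together with the source text, and the candidate output is scored by how probable the model finds it. The sketch below assumes that reading. The names `build_prompt`, `conditional_score`, and `toy_logprob`, the prompt layout, and the instruction wording are all hypothetical and not taken from the paper, and the toy model is only a stand-in for a real pre-trained model so that the example runs without downloads.

```python
import math
from typing import Callable, List, Sequence

# Assumed interface (hypothetical): returns log p(next_token | context).
LogProbFn = Callable[[Sequence[str], str], float]


def build_prompt(instruction: str, source: str) -> List[str]:
    """Prefix the source with an instruction that names the aspect to evaluate.

    The layout "<instruction> Text: <source> Summary:" is an illustrative
    assumption, not a template from the paper.
    """
    return f"{instruction} Text: {source} Summary:".split()


def conditional_score(
    logprob: LogProbFn, prompt: Sequence[str], hypothesis: Sequence[str]
) -> float:
    """Average log-probability of the hypothesis tokens given the prompt."""
    if not hypothesis:
        raise ValueError("hypothesis must contain at least one token")
    context = list(prompt)
    total = 0.0
    for token in hypothesis:
        total += logprob(context, token)
        context.append(token)  # each token conditions on everything before it
    return total / len(hypothesis)


def toy_logprob(context: Sequence[str], token: str) -> float:
    """Toy stand-in for a real model: tokens already in the context are likelier."""
    return math.log(0.5) if token in context else math.log(0.05)


# Hypothetical instruction for a "factual consistency" aspect.
prompt = build_prompt(
    "Generate a factually consistent summary for the following text.",
    "the cat sat on the mat",
)
faithful = conditional_score(toy_logprob, prompt, "the cat sat".split())
unfaithful = conditional_score(toy_logprob, prompt, "a dog ran".split())
print(faithful > unfaithful)
```

Changing the evaluated aspect in this sketch only requires changing the instruction string passed to `build_prompt`, which is the sense in which no aspect-specific annotated data is needed. A real setup would replace `toy_logprob` with token log-probabilities from an actual pre-trained model.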

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational Physics and Python Applications