Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study

Yi Chen; Rui Wang; Haiyun Jiang; Shuming Shi; Ruifeng Xu

arXiv:2304.00723·cs.CL·September 19, 2023·20 cites

Open Access

TL;DR

This study investigates how effectively large language models, especially ChatGPT, evaluate text quality without references. It finds that ChatGPT outperforms most existing automatic metrics and that an explicit numeric scoring method works best.

Contribution

The paper introduces a novel evaluation approach using ChatGPT for reference-free text quality assessment and compares three methods, highlighting the effectiveness of the Explicit Score.

Findings

01. ChatGPT outperforms most automatic metrics in text quality evaluation.

02. The Explicit Score method provides the most reliable and effective assessment.

03. Direct comparison of two texts using ChatGPT may sometimes be suboptimal.

Abstract

Evaluating the quality of generated text is a challenging task in NLP because of the inherent complexity and diversity of text. Recently, large language models (LLMs) have garnered significant attention for their impressive performance on various tasks. We therefore investigate the effectiveness of LLMs, especially ChatGPT, and explore ways to optimize their use in assessing text quality. We compare three kinds of reference-free evaluation methods. The experimental results show that ChatGPT can evaluate text quality effectively from various perspectives without references and that it outperforms most existing automatic metrics. In particular, the Explicit Score, which uses ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable of the three approaches explored. However, directly…
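As a rough illustration of the Explicit Score idea described in the abstract (asking the model for a numeric quality score and reading it back), the sketch below builds a scoring prompt and parses a number from a reply. The prompt wording, the 1 to 5 scale, the `fluency` aspect, and the function names `build_prompt` and `parse_score` are all assumptions made for illustration. They are not the paper's actual prompts or code, and no model call is made here.

```python
import re
from typing import Optional

# Hypothetical prompt wording; the paper's actual prompts are not shown on this page.
PROMPT_TEMPLATE = (
    "Score the following text for {aspect} on a scale from {low} to {high}, "
    "where a higher score means better quality. "
    "Reply with the number only.\n\nText: {text}\nScore:"
)


def build_prompt(text: str, aspect: str = "fluency", low: int = 1, high: int = 5) -> str:
    """Fill the (assumed) template with the text to be judged; no reference text is used."""
    return PROMPT_TEMPLATE.format(aspect=aspect, low=low, high=high, text=text)


def parse_score(reply: str, low: int = 1, high: int = 5) -> Optional[float]:
    """Return the first number found in the reply if it lies in [low, high], else None."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    score = float(match.group())
    return score if low <= score <= high else None
```

In practice the string returned by `build_prompt` would be sent to a chat model and the reply passed to `parse_score`. Taking the first number in the reply is a simplification, so replies that mention other numbers before the score would need stricter handling.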

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification