DEE: Dual-stage Explainable Evaluation Method for Text Generation

Shenyu Zhang; Yu Li; Rui Wu; Xiutian Huang; Yongrui Chen; Wenhao Xu; Guilin Qi

arXiv:2403.11509 · cs.CL · March 19, 2024 · 1 citation

Open Access

TL;DR

DEE is a dual-stage, explainable evaluation method for text generation that efficiently identifies errors and provides detailed diagnostics, improving over existing methods especially in industrial contexts.

Contribution

The paper introduces DEE, a dual-stage evaluation framework built on Llama 2, together with a new dataset, AntEval, to enable comprehensive and explainable assessment of generated texts.

Findings

01. DEE correlates more closely with human judgments than existing evaluation methods.

02. DEE achieves higher efficiency in error detection than existing methods.

03. The AntEval dataset covers new issues such as hallucination and toxicity.

Abstract

Automatic methods for evaluating machine-generated texts hold significant importance due to the expanding applications of generative systems. Conventional methods tend to grapple with a lack of explainability, issuing a solitary numerical score to signify the assessment outcome. Recent advancements have sought to mitigate this limitation by incorporating large language models (LLMs) to offer more detailed error analyses, yet their applicability remains constrained, particularly in industrial contexts where comprehensive error coverage and swift detection are paramount. To alleviate these challenges, we introduce DEE, a Dual-stage Explainable Evaluation method for estimating the quality of text generation. Built upon Llama 2, DEE follows a dual-stage principle guided by stage-specific instructions to perform efficient identification of errors in generated texts in the initial stage and…
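The dual-stage principle described above can be illustrated with a minimal sketch: a first pass that only identifies error types, followed by a second pass that produces a diagnostic for each identified error. Everything named here (the instruction strings, `Diagnosis`, `evaluate`, and `stub_model`) is a hypothetical stand-in; the paper's actual prompts, output format, and Llama 2 fine-tuning setup are not reproduced on this page.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage-specific instructions (assumptions, not the paper's prompts).
STAGE1_INSTRUCTION = "List the error types present in the generated text, or 'none'."
STAGE2_INSTRUCTION = "For the error type '{error}', explain where and why it occurs."


@dataclass
class Diagnosis:
    error_type: str
    explanation: str


def evaluate(text: str, model: Callable[[str], str]) -> List[Diagnosis]:
    """Stage 1 identifies error types; stage 2 explains each one found."""
    # Stage 1: a single short call that only names error types.
    stage1_out = model(f"{STAGE1_INSTRUCTION}\n\nText: {text}")
    errors = [
        e.strip()
        for e in stage1_out.split(",")
        if e.strip() and e.strip().lower() != "none"
    ]

    # Stage 2: one diagnostic call per identified error; skipped if none found.
    diagnoses = []
    for error in errors:
        prompt = STAGE2_INSTRUCTION.format(error=error) + f"\n\nText: {text}"
        diagnoses.append(Diagnosis(error, model(prompt)))
    return diagnoses


def stub_model(prompt: str) -> str:
    """Toy stand-in for an LLM so the sketch runs without any model weights."""
    if prompt.startswith(STAGE1_INSTRUCTION):
        return "hallucination" if "moon is made of cheese" in prompt else "none"
    return "The claim is not supported by the source."


if __name__ == "__main__":
    for sample in ["The moon is made of cheese.", "The sky is blue."]:
        print(sample, "->", evaluate(sample, stub_model))
```

In this sketch, texts with no detected errors cost only the first call, which is one plausible way a two-stage design could support swift detection; whether DEE realizes its efficiency this way is an assumption here.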

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques

Methods: LLaMA