Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model

Qi Jia; Siyu Ren; Yizhu Liu; Kenny Q. Zhu

arXiv:2310.11648 · cs.CL · December 15, 2023 · 2 cites

TL;DR

This paper introduces FFLM, a zero-shot metric that uses a moderately-sized foundation language model to evaluate the faithfulness of summaries. It performs competitively with or better than ChatGPT on inconsistency detection and faithfulness rating while using 24x fewer parameters.

Contribution

The paper presents FFLM, a zero-shot faithfulness evaluation metric that combines probability changes computed by a foundation language model. It uses 24x fewer parameters than ChatGPT, performs competitively with or better than it, and improves over other strong baselines.

Findings

01

FFLM performs competitively with or better than ChatGPT on both inconsistency detection and faithfulness rating.

02

FFLM uses 24x fewer parameters than ChatGPT.

03

FFLM also improves over other strong baselines.

Abstract

Despite tremendous improvements in natural language generation, summarization models still suffer from the unfaithfulness issue. Previous work evaluates faithfulness either by using models trained on other tasks or on in-domain synthetic data, or by prompting a large model such as ChatGPT. This paper proposes to do zero-shot faithfulness evaluation simply with a moderately-sized foundation language model. We introduce a new metric, FFLM, which is a combination of probability changes based on the intuition that prefixing a piece of text that is consistent with the output will increase the probability of predicting the output. Experiments show that FFLM performs competitively with or even outperforms ChatGPT on both inconsistency detection and faithfulness rating with 24x fewer parameters. FFLM also achieves improvements over other strong baselines.
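
The intuition in the abstract can be illustrated with a minimal sketch. It is not the FFLM formula from the paper, which combines several probability changes in a way this page does not spell out. The function name `mean_logprob_change` and all token probabilities below are hypothetical, chosen only to show how a consistent prefix raises the per-token probability of a summary and how an unsupported token weakens that gain.

```python
import math
from typing import Sequence


def mean_logprob_change(
    logp_without_prefix: Sequence[float],
    logp_with_prefix: Sequence[float],
) -> float:
    """Average per-token change in log-probability when a prefix is added.

    Both inputs hold one log-probability per summary token, as a language
    model might assign them. This is an illustrative stand-in for a single
    "probability change" term, not the paper's metric.
    """
    if len(logp_without_prefix) != len(logp_with_prefix):
        raise ValueError("both sequences must score the same tokens")
    if not logp_without_prefix:
        raise ValueError("at least one token is required")
    total = sum(w - wo for wo, w in zip(logp_without_prefix, logp_with_prefix))
    return total / len(logp_without_prefix)


# Hypothetical token probabilities for a three-token summary.
# Without the source document as a prefix, every token is fairly unlikely.
without_prefix = [math.log(0.10), math.log(0.20), math.log(0.05)]

# Faithful summary: the document supports every token, so all probabilities rise.
faithful_with_prefix = [math.log(0.60), math.log(0.70), math.log(0.50)]

# Unfaithful summary: the last token is not supported by the document,
# so its probability does not rise (here it even drops slightly).
unfaithful_with_prefix = [math.log(0.60), math.log(0.70), math.log(0.04)]

faithful_score = mean_logprob_change(without_prefix, faithful_with_prefix)
unfaithful_score = mean_logprob_change(without_prefix, unfaithful_with_prefix)

print(f"faithful:   {faithful_score:.3f}")
print(f"unfaithful: {unfaithful_score:.3f}")
```

Under these made-up numbers the faithful summary receives the larger score. In practice the log-probabilities would come from a foundation language model scoring the summary with and without the source document as context.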

Code & Models

Repositories

jiaqisjtu/faitheval-fflm
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification