Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings

Oleg Vasilyev; John Bohannon

arXiv:2104.05156·cs.CL·April 13, 2021

TL;DR

This paper introduces ESTIME, a reference-free measure of summary faithfulness that counts minute inconsistencies between a summary and its source document. On the summary-level SummEval dataset it correlates with expert scores more strongly than other common evaluation measures, and it is more sensitive to subtle factual errors.

Contribution

The paper presents ESTIME, an embedding-based measure of summary-to-text consistency, and a method for generating subtle factual errors in human summaries. ESTIME is reported to be more sensitive to such errors than other common evaluation measures.

Findings

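01 ESTIME correlates with expert scores on the summary-level SummEval dataset more strongly than other common evaluation measures, in both Consistency and Fluency.

02 ESTIME is more sensitive to subtle factual errors than other common evaluation measures.

03 The measure is designed to find and count minute inconsistencies between a summary and its source document.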

Abstract

We propose a new reference-free measure of summary quality, with an emphasis on faithfulness. The measure is designed to find and count all possible minute inconsistencies of the summary with respect to the source document. The proposed ESTIME (Estimator of Summary-to-Text Inconsistency by Mismatched Embeddings) correlates with expert scores on the summary-level SummEval dataset more strongly than other common evaluation measures, not only in Consistency but also in Fluency. We also introduce a method of generating subtle factual errors in human summaries. We show that ESTIME is more sensitive to subtle errors than other common evaluation measures.
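
The abstract does not spell out the procedure, so the following is only a rough illustration of what counting "mismatched embeddings" could look like, not the authors' implementation. It assumes that every token in the summary and in the text already has an embedding, that similarity is measured by cosine, and that a summary token counts as an inconsistency when the most similar text embedding belongs to a different token. The function names and the two-dimensional vectors are hypothetical; a real setup would presumably take contextual embeddings from a language model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def count_mismatches(summary_tokens, summary_vecs, text_tokens, text_vecs):
    """Count summary tokens whose nearest text embedding belongs to a different token.

    Assumption: "mismatch" means the token at the best-matching text position
    differs from the summary token. This is an illustrative reading only.
    """
    mismatches = 0
    for tok, vec in zip(summary_tokens, summary_vecs):
        # Index of the text embedding most similar to this summary embedding.
        best = max(range(len(text_vecs)), key=lambda i: cosine(vec, text_vecs[i]))
        if text_tokens[best] != tok:
            mismatches += 1
    return mismatches


# Toy, hand-made embeddings (hypothetical values chosen for illustration).
text_tokens = ["sales", "rose", "in", "march"]
text_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.2]]

# The summary replaces "rose" with "fell" but sits in the same context,
# so its embedding lands closest to the text's "rose".
summary_tokens = ["sales", "fell", "in", "march"]
summary_vecs = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8], [-0.9, 0.1]]

n = count_mismatches(summary_tokens, summary_vecs, text_tokens, text_vecs)
print(n)  # 1: only "fell" maps to a different text token
```

Under these assumptions a lower count means a more consistent summary, and a summary copied verbatim from the text would score zero.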

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques