Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

Alex Wang, Kyunghyun Cho, and Mike Lewis

arXiv:2004.04228·cs.CL·April 10, 2020·32 cites

TL;DR

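QAGS is an automatic evaluation protocol that detects factual inconsistencies in a generated summary by asking questions about the summary and its source and comparing the answers. It correlates more strongly with human judgments of factual consistency than other automatic metrics, and it offers interpretability by indicating which summary tokens are inconsistent.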

Contribution

The paper introduces QAGS, a question-answering-based metric for the factual consistency of summaries that correlates better with human judgments than existing automatic metrics.

Findings

01

QAGS correlates better with human judgments of factual consistency than other automatic evaluation metrics.

02

QAGS provides interpretability by highlighting inconsistent tokens.

03

QAGS identifies factual errors in model-generated summaries for the CNN/DailyMail and XSUM datasets.

Abstract

Practical applications of abstractive summarization models are limited by frequent factual inconsistencies with respect to their input. Existing automatic evaluation metrics for summarization are largely insensitive to such errors. We propose an automatic evaluation protocol called QAGS (pronounced "kags") that is designed to identify factual inconsistencies in a generated summary. QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source. To evaluate QAGS, we collect human judgments of factual consistency on model-generated summaries for the CNN/DailyMail (Hermann et al., 2015) and XSUM (Narayan et al., 2018) summarization datasets. QAGS has substantially higher correlations with these judgments than other automatic evaluation metrics. Also, QAGS offers a natural form of…
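The intuition in the abstract (ask questions, answer them against both the summary and the source, and compare the answers) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `token_f1`, `consistency_score`, `toy_ask`, and `toy_answer` are hypothetical names, the question generator and answering function are trivial stand-ins for trained models, and token-level F1 is an assumed choice of answer-similarity measure.

```python
from collections import Counter
from typing import Callable, List


def token_f1(a: str, b: str) -> float:
    """Token-level F1 between two answers (assumed similarity measure)."""
    a_toks, b_toks = a.lower().split(), b.lower().split()
    if not a_toks or not b_toks:
        # Two empty answers agree; one empty answer disagrees.
        return float(a_toks == b_toks)
    # Multiset intersection counts the tokens shared by both answers.
    overlap = sum((Counter(a_toks) & Counter(b_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(a_toks)
    recall = overlap / len(b_toks)
    return 2 * precision * recall / (precision + recall)


def consistency_score(
    summary: str,
    source: str,
    ask: Callable[[str], List[str]],
    answer: Callable[[str, str], str],
) -> float:
    """Average answer agreement over questions asked about the summary."""
    questions = ask(summary)
    if not questions:
        return 0.0
    scores = [
        token_f1(answer(q, summary), answer(q, source)) for q in questions
    ]
    return sum(scores) / len(scores)


# Hypothetical stand-ins: a real system would use trained models here.
def toy_ask(text: str) -> List[str]:
    return ["Who approved the budget?"]


def toy_answer(question: str, text: str) -> str:
    # Pretend the answer is always the first two words of the text.
    return " ".join(text.split()[:2])


source = "The council approved the budget on Tuesday."
consistent = "The council approved the budget."
inconsistent = "The mayor approved the budget."

print(consistency_score(consistent, source, toy_ask, toy_answer))
print(consistency_score(inconsistent, source, toy_ask, toy_answer))
```

With these toy stand-ins, the summary that names the wrong actor receives a lower score than the faithful one, because its answer ("The mayor") only partly overlaps the answer drawn from the source ("The council"). The per-question answer pairs are also what would make such a score inspectable.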

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques