Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Yuexiang Xie, Fei Sun, Yang Deng, Yaliang Li, Bolin Ding

TL;DR
This paper introduces an automatic evaluation metric for text summarization that measures factual consistency by removing the influence of the language prior, improving correlation with human judgments without relying on auxiliary tasks.
Contribution
The paper proposes a metric based on counterfactual estimation that isolates the effect of the language prior, so that factual consistency in summarization can be evaluated without auxiliary tasks.
Findings
Improves correlation with human judgments
Effective across multiple datasets
Simplifies evaluation process
Abstract
Although significant progress has been achieved in text summarization, factual inconsistency in generated summaries still severely limits its practical applications. Among the key factors to ensure factual consistency, a reliable automatic evaluation metric is the first and the most crucial one. However, existing metrics either neglect the intrinsic cause of factual inconsistency or rely on auxiliary tasks, leading to unsatisfactory correlation with human judgments or making them inconvenient to use in practice. In light of these challenges, we propose a novel metric to evaluate factual consistency in text summarization via counterfactual estimation, which formulates the causal relationship among the source document, the generated summary, and the language prior. We remove the effect of the language prior, which can cause factual inconsistency, from the total causal effect on…
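The idea of subtracting the language prior's effect from the total effect can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the per-token averaging, and the toy probabilities are all assumptions made for illustration. It assumes that a scoring model has already produced, for each summary token, one probability when conditioned on the source document and one when the source is withheld (so only the language prior remains).

```python
from typing import Sequence


def counterfactual_consistency_score(
    p_with_source: Sequence[float],
    p_prior_only: Sequence[float],
) -> float:
    """Hypothetical score: mean per-token gap between two probabilities.

    p_with_source[i]: probability of summary token i given the source
                      document and the preceding summary tokens.
    p_prior_only[i]:  probability of the same token when the source is
                      withheld, so only the language prior contributes.
    """
    if len(p_with_source) != len(p_prior_only):
        raise ValueError("both inputs must cover the same summary tokens")
    if not p_with_source:
        raise ValueError("at least one summary token is required")

    # Total effect minus the effect attributable to the language prior.
    gaps = [full - prior for full, prior in zip(p_with_source, p_prior_only)]
    return sum(gaps) / len(gaps)


# Toy numbers (invented for illustration, not taken from the paper).
grounded = counterfactual_consistency_score([0.9, 0.8], [0.2, 0.3])
prior_driven = counterfactual_consistency_score([0.5, 0.6], [0.5, 0.6])
```

Under these assumptions, a summary whose tokens become much more likely once the source is visible scores higher than one whose tokens are equally likely with or without the source, which is the behavior a prior-driven (and possibly hallucinated) summary would show.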
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
