FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction
Alessandro Scirè, Karim Ghonim, Roberto Navigli

TL;DR
FENICE is a new, interpretable, and efficient factuality metric for summarization. It uses natural language inference and claim extraction to evaluate factual consistency, and it outperforms existing metrics, especially on long summaries.
Contribution
The paper introduces FENICE, a novel factuality evaluation metric combining NLI and claim extraction, addressing interpretability and efficiency issues of prior methods.
Findings
FENICE achieves state-of-the-art results on the AGGREFACT benchmark.
It effectively evaluates factuality in long-form summaries.
The method is more interpretable and more computationally practical than prior metrics.
Abstract
Recent advancements in text summarization, particularly with the advent of Large Language Models (LLMs), have shown remarkable performance. However, a notable challenge persists as a substantial number of automatically-generated summaries exhibit factual inconsistencies, such as hallucinations. In response to this issue, various approaches for the evaluation of consistency for summarization have emerged. Yet, these newly-introduced metrics face several limitations, including lack of interpretability, focus on short document summaries (e.g., news articles), and computational impracticality, especially for LLM-based metrics. To address these shortcomings, we propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE), a more interpretable and efficient factuality-oriented metric. FENICE leverages an NLI-based alignment between…
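Illustrative sketch
The general idea of scoring a summary by checking its extracted claims against the source with NLI can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: claims are assumed to be already extracted and passed in as strings, `toy_entailment` is a hypothetical token-overlap stand-in for a real NLI model, and the aggregation (best-matching source sentence per claim, then the mean over claims) is an assumption, since the abstract above is truncated before it describes the alignment.

```python
from typing import Callable, List, Tuple


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model.

    Returns the fraction of hypothesis tokens that also occur in the premise.
    A real system would return an entailment probability from a trained model.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(tok in premise_tokens for tok in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def factuality_score(
    source_sentences: List[str],
    claims: List[str],
    entail: Callable[[str, str], float] = toy_entailment,
) -> Tuple[float, List[float]]:
    """Score a summary from its claims (assumed aggregation, see lead-in).

    Each claim gets the highest entailment score over all source sentences;
    the summary score is the mean of the per-claim scores. The per-claim
    scores are returned too, so low-scoring claims can be inspected.
    """
    per_claim = [
        max((entail(sentence, claim) for sentence in source_sentences), default=0.0)
        for claim in claims
    ]
    overall = sum(per_claim) / len(per_claim) if per_claim else 0.0
    return overall, per_claim


if __name__ == "__main__":
    source = ["the cat sat on the mat", "it rained all day"]
    summary_claims = ["the cat sat", "the dog barked"]
    score, details = factuality_score(source, summary_claims)
    print(score, details)
```

Returning the per-claim scores alongside the overall score is what makes a metric of this shape inspectable: a reader can see which individual claim lacks support in the source.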
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Sparse Evolutionary Training · Focus
