TL;DR
This paper introduces a dependency-arc-based entailment approach to evaluate factual consistency in text generation, effectively localizing errors and outperforming sentence-level methods in identifying factual inaccuracies.
Contribution
The paper proposes a dependency-arc-level formulation of entailment, together with a method for creating training data automatically, to improve factuality evaluation and error localization in generated text.
Findings
Dependency arc entailment outperforms sentence-level models in factuality detection.
The method effectively localizes non-factual parts in generated text.
Training data is created automatically from existing resources, because human judgments at the arc level are difficult to obtain.
Abstract
Despite significant progress in text generation models, a serious limitation is their tendency to produce text that is factually inconsistent with information in the input. Recent work has studied whether textual entailment systems can be used to identify factual errors; however, these sentence-level entailment models are trained to solve a different problem than generation filtering and they do not localize which part of a generation is non-factual. In this paper, we propose a new formulation of entailment that decomposes it at the level of dependency arcs. Rather than focusing on aggregate decisions, we instead ask whether the semantic relationship manifested by individual dependency arcs in the generated output is supported by the input. Human judgments on this task are difficult to obtain; we therefore propose a method to automatically create data based on existing entailment or…
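The arc-level formulation described in the abstract can be illustrated with a small sketch. Everything below is a hypothetical illustration, not the authors' implementation: the `Arc` type, the `toy_arc_supported` scorer, and the `evaluate_arcs` helper are invented names, the arcs are written by hand instead of coming from a dependency parser, and the scorer is a plain token-overlap check standing in for a trained arc-entailment model. The rule that an output counts as consistent only when every arc is supported is also an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Arc:
    """One dependency arc of the generated output (hypothetical type)."""
    head: str
    relation: str
    child: str


def toy_arc_supported(source: str, arc: Arc) -> bool:
    """Toy stand-in for a learned arc-entailment model.

    Treats an arc as supported when both of its words occur in the
    source text. A real model would judge the semantic relation, not
    just word overlap.
    """
    tokens = set(source.lower().split())
    return arc.head.lower() in tokens and arc.child.lower() in tokens


def evaluate_arcs(
    source: str,
    arcs: List[Arc],
    scorer: Callable[[str, Arc], bool] = toy_arc_supported,
) -> Tuple[bool, List[Arc]]:
    """Score each arc separately and return (is_consistent, unsupported_arcs).

    Assumption for this sketch: the output is consistent only if
    every arc is supported by the source.
    """
    unsupported = [arc for arc in arcs if not scorer(source, arc)]
    return len(unsupported) == 0, unsupported


# Source text and hand-written arcs for the generated sentence
# "the company reported a profit".
source = "the company reported a loss in the third quarter"
arcs = [
    Arc(head="reported", relation="nsubj", child="company"),
    Arc(head="reported", relation="dobj", child="profit"),
]

is_consistent, flagged = evaluate_arcs(source, arcs)
print(is_consistent)  # whether every arc was supported
print(flagged)        # the arcs that were not supported
```

Because the decision is made per arc, the list of unsupported arcs points at the part of the output that is not backed by the input, which a single sentence-level entailment label cannot do.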
