TL;DR
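This paper introduces a two-step method for automatically updating Wikipedia sentences to agree with new facts: it first removes the parts of a sentence that contradict a given claim, then rewrites the remaining text to be consistent with that claim. The rewritten sentences can also be used as synthetic training data for fact-checking.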
Contribution
It pairs a neutralizing stance model, which identifies and removes the components of a sentence that contradict a claim, with a novel two-encoder sequence-to-sequence model with copy attention that expands the remaining text to be consistent with the claim.
Findings
The method achieves the highest SARI score on a Wikipedia fact update dataset.
Synthetic data generated from the rewritten sentences improves fact-checking, with a 13% error reduction.
The method successfully rewrites sentences to incorporate new factual information.
Abstract
Online encyclopediae like Wikipedia contain large amounts of text that need frequent corrections and updates. The new information may contradict existing content in encyclopediae. In this paper, we focus on rewriting such dynamically changing articles. This is a challenging constrained generation task, as the output must be consistent with the new information and fit into the rest of the existing document. To this end, we propose a two-step solution: (1) We identify and remove the contradicting components in a target text for a given claim, using a neutralizing stance model; (2) We expand the remaining text to be consistent with the given claim, using a novel two-encoder sequence-to-sequence model with copy attention. Applied to a Wikipedia fact update dataset, our method successfully generates updated sentences for new claims, achieving the highest SARI score. Furthermore, we…
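The two-step structure described in the abstract (mask what contradicts the claim, then fill the gaps from the claim) can be illustrated with a toy sketch. This is not the paper's implementation: the neutralizing stance model and the two-encoder sequence-to-sequence model are replaced here by simple hypothetical heuristics that only handle numeric tokens, and all function names (`mask_contradictions`, `fill_masks`, `update_sentence`) are invented for illustration.

```python
import re

MASK = "[MASK]"


def is_number(token: str) -> bool:
    """Return True for plain integer or decimal tokens such as '5000' or '3.5'."""
    return re.fullmatch(r"\d+(\.\d+)?", token) is not None


def mask_contradictions(sentence: list[str], claim: list[str]) -> list[str]:
    """Step 1 (toy stand-in for the neutralizing stance model).

    Assumption: a numeric token in the sentence that does not appear in the
    claim is treated as contradicting it and is replaced by a mask.
    """
    claim_tokens = set(claim)
    return [
        MASK if is_number(tok) and tok not in claim_tokens else tok
        for tok in sentence
    ]


def fill_masks(masked: list[str], claim: list[str]) -> list[str]:
    """Step 2 (toy stand-in for the two-encoder model with copy attention).

    Assumption: each mask is filled, in order, by copying a numeric token
    from the claim that is not already present in the masked sentence.
    Masks with no available candidate are left in place.
    """
    present = set(masked)
    candidates = [tok for tok in claim if is_number(tok) and tok not in present]
    output = []
    for tok in masked:
        if tok == MASK and candidates:
            output.append(candidates.pop(0))
        else:
            output.append(tok)
    return output


def update_sentence(sentence: str, claim: str) -> str:
    """Run both steps on whitespace-tokenized text."""
    sentence_tokens = sentence.split()
    claim_tokens = claim.split()
    masked = mask_contradictions(sentence_tokens, claim_tokens)
    return " ".join(fill_masks(masked, claim_tokens))


if __name__ == "__main__":
    old = "The city has a population of 5000 people"
    new_fact = "The population of the city is 7200"
    print(update_sentence(old, new_fact))
```

The point of the split is that the first step leaves a sentence that neither supports nor contradicts the claim, so the second step only has to add information rather than also decide what to delete. In the paper both steps are learned models; the heuristics above only mirror that division of labor.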
