Verifying Text Summaries of Relational Data Sets
Saehan Jo, Immanuel Trummer, Weicheng Yu, Daniel Liu, Xuezhi Wang, Cong Yu, Niyati Mehta

TL;DR
The paper introduces the AggChecker, a probabilistic tool for verifying the accuracy of natural language claims in text summaries of relational data, improving fact-checking efficiency and accuracy.
Contribution
It presents a novel probabilistic model and system for automatically verifying text claims against data, enabling faster and more accurate fact-checking of summaries.
Findings
Revealed errors in about one-third of tested articles.
Users checked summaries six times faster with AggChecker.
Achieved higher recall and precision than existing baselines.
Abstract
We present a novel natural language query interface, the AggChecker, aimed at text summaries of relational data sets. The tool focuses on natural language claims that translate into an SQL query and a claimed query result. Similar in spirit to a spell checker, the AggChecker marks up text passages that seem to be inconsistent with the actual data. At the heart of the system is a probabilistic model that reasons about the input document in a holistic fashion. Based on claim keywords and the document structure, it maps each text claim to a probability distribution over associated query translations. By efficiently executing tens to hundreds of thousands of candidate translations for a typical input document, the system maps text claims to correctness probabilities. This process becomes practical via a specialized processing backend, avoiding redundant work via query merging and result…
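The central idea of the abstract, mapping a claim to a distribution over candidate queries and then to a correctness probability, can be illustrated with a deliberately simplified sketch. This is not the paper's actual model: the table, the candidate queries, their weights, and the helper `correctness_probability` are all hypothetical, and the matching rule (summing the weight of candidates whose result agrees with the claimed value) is an assumption made for illustration only.

```python
import sqlite3


def correctness_probability(conn, candidates, claimed_value, tol=0.5):
    """Return the share of candidate-query weight whose result matches the claim.

    candidates: list of (sql, weight) pairs; each query returns one numeric value.
    tol: hypothetical tolerance standing in for rounding in the text.
    """
    total = sum(weight for _, weight in candidates)
    matching = 0.0
    for sql, weight in candidates:
        (result,) = conn.execute(sql).fetchone()
        if result is not None and abs(result - claimed_value) <= tol:
            matching += weight
    return matching / total if total else 0.0


# Toy relational data set (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (airline TEXT, delayed INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("A", 1), ("A", 0), ("A", 1), ("B", 0), ("B", 0), ("B", 1)],
)

# Candidate translations for a claim such as "Airline A had two delayed flights".
# The weights stand in for keyword- and structure-based likelihoods (assumed values).
candidates = [
    ("SELECT COUNT(*) FROM flights WHERE airline = 'A' AND delayed = 1", 0.6),
    ("SELECT COUNT(*) FROM flights WHERE airline = 'A'", 0.3),
    ("SELECT COUNT(*) FROM flights WHERE delayed = 1", 0.1),
]

p_two = correctness_probability(conn, candidates, claimed_value=2)
p_five = correctness_probability(conn, candidates, claimed_value=5)
print(f"claim '2': {p_two:.2f}, claim '5': {p_five:.2f}")
```

In this toy setup a claimed value of 2 agrees only with the most likely translation, while a claimed value of 5 agrees with none and would be the kind of passage a checker marks up. The system described in the abstract evaluates far more candidates per document and reasons about all claims jointly, which this sketch does not attempt.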
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Data Quality and Management
