Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation

Ran Tian; Shashi Narayan; Thibault Sellam; Ankur P. Parikh

arXiv:1910.08684 · cs.CL · November 3, 2020 · 48 cites

Open Access

TL;DR

This paper introduces a confidence scoring method combined with variational Bayes training to reduce hallucinations in data-to-text generation, leading to more faithful and accurate text outputs aligned with source data.

Contribution

It proposes a novel confidence score mechanism and a variational Bayes training framework to improve faithfulness in data-to-text generation models.

Findings

01

Outperforms existing methods on the WikiBio dataset in faithfulness.

02

Achieves strong results on the WebNLG dataset.

03

Demonstrates improved human evaluation scores for faithfulness.

Abstract

We address the issue of hallucination in data-to-text generation, i.e., reducing the generation of text that is unsupported by the source. We conjecture that hallucination can be caused by an encoder-decoder model generating content phrases without attending to the source; so we propose a confidence score to ensure that the model attends to the source whenever necessary, as well as a variational Bayes training framework that can learn the score from data. Experiments on the WikiBio (Lebret et al., 2016) dataset show that our approach is more faithful to the source than existing state-of-the-art approaches, according to both PARENT score (Dhingra et al., 2019) and human evaluation. We also report strong results on the WebNLG (Gardent et al., 2017) dataset.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications