Search and Learn: Improving Semantic Coverage for Data-to-Text Generation

Shailza Jolly; Zi Xuan Zhang; Andreas Dengel; Lili Mou

arXiv:2112.02770·cs.CL·December 7, 2021

TL;DR

This paper introduces a search-and-learning method for few-shot data-to-text generation. It inserts input slots that are missing from the text produced by a fine-tuned pretrained language model, then fine-tunes the system on the search results, which improves semantic coverage.

Contribution

The work presents a novel search-and-learning approach that significantly improves semantic coverage in few-shot data-to-text generation using pretrained language models.

Findings

1. Achieves 98.35% slot coverage on the E2E dataset
2. Improves semantic coverage and inference efficiency
3. Outperforms baseline models on benchmark datasets
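Slot coverage, the quantity behind the first finding, can be illustrated with a minimal sketch. It assumes coverage is the fraction of input slot values that appear verbatim (case-insensitively) in the generated text; the paper's exact matching rule may differ. The function names `slot_coverage` and `missing_slots` and the example record are hypothetical, chosen for illustration only.

```python
from typing import Dict, List


def missing_slots(slots: Dict[str, str], text: str) -> List[str]:
    """Return the names of slots whose value does not occur in the text.

    Assumption: a slot counts as covered when its value appears verbatim,
    ignoring case. Real evaluations may use looser matching.
    """
    lowered = text.lower()
    return [name for name, value in slots.items() if value.lower() not in lowered]


def slot_coverage(slots: Dict[str, str], text: str) -> float:
    """Fraction of input slots that are covered by the text."""
    if not slots:
        return 1.0  # nothing to cover
    return 1.0 - len(missing_slots(slots, text)) / len(slots)


# Hypothetical E2E-style input record and a generated sentence.
record = {"name": "The Eagle", "food": "French", "area": "riverside"}
generated = "The Eagle serves French food."

print(missing_slots(record, generated))   # slots absent from the text
print(slot_coverage(record, generated))   # share of slots present
```

Here two of the three slots are mentioned, so the `area` slot is reported as missing; this is the low-coverage situation the paper targets.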

Abstract

Data-to-text generation systems aim to generate text descriptions based on input data (often represented in tabular form). A typical system uses a large number of training samples to learn the correspondence between tables and texts. However, large training sets are expensive to obtain, limiting the applicability of these approaches in real-world scenarios. In this work, we focus on few-shot data-to-text generation. We observe that, while fine-tuned pretrained language models may generate plausible sentences, they suffer from low semantic coverage in the few-shot setting. In other words, important input slots tend to be missing from the generated text. To this end, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve the semantic coverage. We further fine-tune our system based on the search results to smooth…
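The search step described in the abstract, inserting missing slots into a generated sentence, can be sketched as a greedy insertion search. This is an illustrative sketch under stated assumptions, not the authors' implementation: it tries every insertion position for each missing slot value and keeps the candidate that a scoring function prefers. In practice the scorer would presumably be a language model; here `toy_score` is a hypothetical stand-in that counts a few hand-picked plausible bigrams. The later fine-tuning on search results is not shown.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a language-model score: counts adjacent word
# pairs from a small hand-written set of plausible bigrams.
PLAUSIBLE_BIGRAMS = {("a", "riverside"), ("riverside", "pub")}


def toy_score(tokens: List[str]) -> float:
    pairs = zip(tokens, tokens[1:])
    return sum(1.0 for a, b in pairs if (a.lower(), b.lower()) in PLAUSIBLE_BIGRAMS)


def insert_missing_slots(
    tokens: List[str],
    slots: Dict[str, str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Greedily insert each missing slot value at its best-scoring position."""
    tokens = list(tokens)  # do not mutate the caller's list
    for value in slots.values():
        if value.lower() in " ".join(tokens).lower():
            continue  # slot already covered (verbatim match assumed)
        phrase = value.split()
        # One candidate sentence per possible insertion position.
        candidates = [
            tokens[:i] + phrase + tokens[i:] for i in range(len(tokens) + 1)
        ]
        tokens = max(candidates, key=score)
    return tokens


draft = "The Eagle is a pub".split()
record = {"name": "The Eagle", "area": "riverside"}
print(" ".join(insert_missing_slots(draft, record, toy_score)))
```

With this toy scorer the `area` value lands between "a" and "pub", because that is the only position where both plausible bigrams occur. Swapping in a real language-model scorer would change only the `score` argument.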

Code & Models

Repositories

shailzajolly/fsdt
PyTorch · Official

Videos

Search and Learn: Improving Semantic Coverage for Data-to-Text Generation · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques