Search and Learn: Improving Semantic Coverage for Data-to-Text Generation
Shailza Jolly, Zi Xuan Zhang, Andreas Dengel, Lili Mou

TL;DR
This paper introduces a search-and-learning method that enhances semantic coverage in data-to-text generation, especially in few-shot settings, by inserting missing input slots into generated text, leading to more accurate and comprehensive descriptions.
Contribution
The work presents a novel search-and-learning approach that significantly improves semantic coverage in few-shot data-to-text generation using pretrained language models.
Findings
Achieves 98.35% slot coverage on the E2E dataset
Improves semantic coverage and inference efficiency
Outperforms baseline models on benchmark datasets
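As a rough illustration of what a slot coverage figure measures, the sketch below computes the fraction of input slot values that appear in a generated sentence. It assumes a simple case-insensitive substring match; the paper's exact matching rule is not given in this summary, and the names `slot_coverage`, `example_slots`, and `example_text` are hypothetical.

```python
from typing import Dict


def slot_coverage(slots: Dict[str, str], text: str) -> float:
    """Return the fraction of slot values found in `text`.

    Assumption: a slot counts as covered when its value occurs as a
    case-insensitive substring of the text. A real evaluation may use
    a more forgiving matcher (synonyms, paraphrases).
    """
    if not slots:
        return 1.0  # nothing to cover
    lowered = text.lower()
    covered = sum(1 for value in slots.values() if value.lower() in lowered)
    return covered / len(slots)


# Hypothetical E2E-style input: three slots, one of which is missing.
example_slots = {"name": "The Eagle", "food": "Italian", "area": "riverside"}
example_text = "The Eagle serves Italian food."
coverage = slot_coverage(example_slots, example_text)
print(f"coverage = {coverage:.2%}")
```

Here two of the three slot values occur in the sentence, so the missing `area` slot lowers the score.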
Abstract
Data-to-text generation systems aim to generate text descriptions from input data (often represented in tabular form). A typical system uses a very large number of training samples to learn the correspondence between tables and texts. However, large training sets are expensive to obtain, limiting the applicability of these approaches in real-world scenarios. In this work, we focus on few-shot data-to-text generation. We observe that, while fine-tuned pretrained language models may generate plausible sentences, they suffer from low semantic coverage in the few-shot setting. In other words, important input slots tend to be missing from the generated text. To this end, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve semantic coverage. We further fine-tune our system based on the search results to smooth…
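The slot-insertion idea in the abstract can be sketched as a small greedy search: find the slot values missing from a draft sentence, try each one at every token position, and keep the candidate that a scoring function rates highest. This is a minimal sketch under stated assumptions, not the authors' implementation. The scorer `toy_score` is a hypothetical stand-in for a pretrained language model, the bigram set is hand-written for the example, and the later fine-tuning stage is not shown.

```python
from typing import Callable, Dict, List

# Assumption: a tiny hand-written set of "fluent" bigrams replaces a real
# language-model score, purely so the example is deterministic.
FLUENT_BIGRAMS = {("serves", "italian")}


def toy_score(sentence: str) -> float:
    """Hypothetical scorer: count bigrams that appear in FLUENT_BIGRAMS."""
    tokens = sentence.lower().split()
    return float(sum(1 for pair in zip(tokens, tokens[1:]) if pair in FLUENT_BIGRAMS))


def missing_slots(slots: Dict[str, str], text: str) -> List[str]:
    """Slot values that do not occur (case-insensitively) in the text."""
    lowered = text.lower()
    return [value for value in slots.values() if value.lower() not in lowered]


def insert_missing_slots(
    slots: Dict[str, str], text: str, score: Callable[[str], float]
) -> str:
    """Greedily insert each missing slot value at its best-scoring position."""
    tokens = text.split()
    for value in missing_slots(slots, text):
        # One candidate per possible insertion point, including both ends.
        candidates = [
            tokens[:i] + value.split() + tokens[i:] for i in range(len(tokens) + 1)
        ]
        tokens = max(candidates, key=lambda cand: score(" ".join(cand)))
    return " ".join(tokens)


draft = "The Eagle serves food."
slots = {"name": "The Eagle", "food": "Italian"}
repaired = insert_missing_slots(slots, draft, toy_score)
print(repaired)
```

With a real language model as the scorer, the same loop would prefer insertion points that keep the sentence fluent; the toy scorer only rewards the one bigram listed above.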
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
