Rows from Many Sources: Enriching row completions from Wikidata with a pre-trained Language Model
Carina Negreanu, Alperen Karaoglu, Jack Williams, Shuang Chen, Daniel Fabian, Andrew Gordon, Chin-Yew Lin

TL;DR
This paper enhances row completion in tables by combining knowledge base interpretation with GPT-3 generated text, achieving state-of-the-art results on the WikiTables benchmark.
Contribution
It introduces a novel approach that integrates knowledge base linking with large language model synthesis to improve row augmentation in tables.
Findings
State-of-the-art performance on WikiTables benchmark
Effective synthesis of diverse rows using GPT-3
Metadata such as headers generated through property linking, then reused to build better prompts
Abstract
Row completion is the task of augmenting a given table of text and numbers with additional, relevant rows. The task divides into two steps: subject suggestion, the task of populating the main column; and gap filling, the task of populating the remaining columns. We present state-of-the-art results for subject suggestion and gap filling measured on a standard benchmark (WikiTables). Our idea is to solve this task by harmoniously combining knowledge base table interpretation and free text generation. We interpret the table using the knowledge base to suggest new rows and generate metadata like headers through property linking. To improve candidate diversity, we synthesize additional rows using free text generation via GPT-3, and crucially, we exploit the metadata we interpret to produce better prompts for text generation. Finally, we verify that the additional synthesized content can be…
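The two steps described in the abstract can be illustrated with a minimal prompt-building sketch. It assumes a pipe-delimited serialization and a hypothetical helper named `build_row_prompt`; neither is taken from the paper, and the `headers` argument simply stands in for the metadata recovered through property linking. Leaving the final row empty corresponds to subject suggestion, while seeding it with a subject corresponds to gap filling.

```python
from typing import List, Optional


def build_row_prompt(
    headers: List[str],
    rows: List[List[str]],
    caption: Optional[str] = None,
    subject: Optional[str] = None,
) -> str:
    """Serialize a table as text that ends in an open row for a model to continue.

    Hypothetical format for illustration only; the paper's actual prompt
    layout is not given in the abstract.
    """
    lines = []
    if caption:
        lines.append(f"Table: {caption}")
    # Headers stand in for metadata interpreted from the knowledge base.
    lines.append(" | ".join(headers))
    for row in rows:
        if len(row) != len(headers):
            raise ValueError("row width must match header width")
        lines.append(" | ".join(row))
    # Open final row: empty for subject suggestion,
    # seeded with the main-column value for gap filling.
    lines.append(f"{subject} |" if subject else "")
    return "\n".join(lines)


if __name__ == "__main__":
    example = build_row_prompt(
        headers=["Country", "Capital"],
        rows=[["France", "Paris"], ["Japan", "Tokyo"]],
        caption="Countries and their capitals",
        subject="Kenya",
    )
    print(example)
```

The resulting string would be passed to whichever text generation model is in use; the model call itself is omitted here because the abstract does not specify it.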
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Weight Decay · Dropout · Adam · Byte Pair Encoding · Dense Connections · Attention Dropout
