Retrieve-and-Verify: A Table Context Selection Framework for Accurate Column Annotations
Zhihao Ding, Yongkang Sun, Jieming Shi

TL;DR
This paper introduces a retrieve-and-verify framework with two methods, REVEAL and REVEAL+, that improve the accuracy of column annotations in tables by selecting and verifying relevant context, outperforming existing approaches.
Contribution
The paper proposes a novel context selection and verification framework for column annotation, addressing the limitations of coarse-grained methods in wide tables.
Findings
REVEAL effectively selects compact, relevant column contexts.
REVEAL+ improves annotation accuracy through context verification.
The methods outperform state-of-the-art baselines on six benchmark datasets.
Abstract
Tables are a prevalent format for structured data, yet their metadata, such as semantic types and column relationships, is often incomplete or ambiguous. Column annotation tasks, including Column Type Annotation (CTA) and Column Property Annotation (CPA), are critical for data management and address this gap by leveraging table context. Existing methods typically serialize all columns in a table as input to pretrained language models to incorporate context, but this coarse-grained approach often degrades performance in wide tables with many irrelevant or misleading columns. To address this, we propose a novel retrieve-and-verify context selection framework for accurate column annotation, introducing two methods: REVEAL and REVEAL+. In REVEAL, we design an efficient unsupervised retrieval technique to select compact, informative column contexts by balancing semantic relevance and diversity, and…
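The abstract is truncated before it describes how REVEAL balances relevance and diversity, so the snippet below is not the paper's algorithm. It is a minimal sketch of one generic way to make that trade-off, a greedy maximal-marginal-relevance (MMR) style heuristic swapped in for illustration. The function names, the `trade_off` parameter, and the toy column embeddings are all assumptions and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_context(target, candidates, k=2, trade_off=0.5):
    """Greedy MMR-style selection (illustrative, not REVEAL itself).

    Returns the indices of up to k candidate columns, in the order chosen.
    trade_off=1.0 ranks by relevance to the target column only;
    lower values penalize similarity to columns already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(target, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical embeddings: one target column and four candidate context columns.
target = [1.0, 1.0, 0.0]
candidates = [
    [1.0, 0.3, 0.0],   # 0: most relevant to the target
    [1.0, 0.25, 0.0],  # 1: near-duplicate of column 0
    [0.2, 1.0, 0.0],   # 2: relevant, but different from column 0
    [0.0, 0.0, 1.0],   # 3: unrelated to the target
]

balanced = select_context(target, candidates, k=2, trade_off=0.5)
relevance_only = select_context(target, candidates, k=2, trade_off=1.0)
print(balanced, relevance_only)
```

In this toy setup, ranking by relevance alone keeps the two near-duplicate columns, while the balanced setting replaces the duplicate with the column that adds different information. That is the intuition behind preferring a compact, diverse context over serializing every column.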
