Retrieve-and-Verify: A Table Context Selection Framework for Accurate Column Annotations

Zhihao Ding; Yongkang Sun; Jieming Shi

arXiv:2508.17203·cs.DB·August 26, 2025

TL;DR

This paper introduces a retrieve-and-verify framework with two methods, REVEAL and REVEAL+, that improve the accuracy of column annotations in tables by selecting and verifying relevant context, outperforming existing approaches.

Contribution

The paper proposes a novel context selection and verification framework for column annotation, addressing the limitations of coarse-grained methods in wide tables.

Findings

01 REVEAL effectively selects compact, relevant column contexts.

02 REVEAL+ improves annotation accuracy through context verification.

03 The methods outperform state-of-the-art baselines on six benchmark datasets.

Abstract

Tables are a prevalent format for structured data, yet their metadata, such as semantic types and column relationships, is often incomplete or ambiguous. Column annotation tasks, including Column Type Annotation (CTA) and Column Property Annotation (CPA), are critical for data management and address this gap by leveraging table context. Existing methods typically serialize all columns of a table and feed them to pretrained language models to incorporate context, but this coarse-grained approach often degrades performance on wide tables with many irrelevant or misleading columns. To address this, we propose a novel retrieve-and-verify context selection framework for accurate column annotation, introducing two methods: REVEAL and REVEAL+. In REVEAL, we design an efficient unsupervised retrieval technique to select compact, informative column contexts by balancing semantic relevance and diversity, and…
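
The retrieval idea in the abstract, choosing a small set of context columns that are both relevant to the target column and diverse among themselves, can be illustrated with a greedy selection in the style of maximal marginal relevance (MMR). This is only a sketch under stated assumptions, not the REVEAL algorithm itself: the abstract does not specify the scoring function, so the use of cosine similarity over precomputed column embeddings, the function name `select_context_columns`, and the `trade_off` parameter are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_context_columns(target_vec, candidate_vecs, k=3, trade_off=0.5):
    """Greedy MMR-style selection (illustrative, not the paper's method).

    Picks up to k candidate columns whose embeddings are similar to the
    target column (relevance) while penalizing similarity to columns that
    were already picked (redundancy). Returns indices into candidate_vecs
    in the order they were selected.
    """
    relevance = [cosine(target_vec, c) for c in candidate_vecs]
    selected = []
    remaining = list(range(len(candidate_vecs)))

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already-selected column.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy embeddings (assumed): columns 0 and 1 are near-duplicates,
# column 2 is less relevant to the target but carries different information.
target = [1.0, 0.0, 0.0]
candidates = [
    [0.8, 0.6, 0.0],
    [0.8, 0.6, 0.01],
    [0.6, 0.0, 0.8],
]
balanced = select_context_columns(target, candidates, k=2, trade_off=0.5)
relevance_only = select_context_columns(target, candidates, k=2, trade_off=1.0)
```

With `trade_off=1.0` the selection reduces to plain top-k relevance and keeps both near-duplicate columns; lowering it swaps the duplicate for the more distinct column, which is the kind of compact, non-redundant context the abstract describes.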
