Web Table Extraction, Retrieval and Augmentation: A Survey

Shuo Zhang; Krisztian Balog

arXiv:2002.00207 · cs.IR · February 6, 2020 · 26 cites

TL;DR

This survey comprehensively reviews two decades of research on web tables, covering extraction, interpretation, search, question answering, and augmentation, highlighting key approaches and resources.

Contribution

The survey organizes the existing literature into six main categories of information access tasks, giving a structured overview of the field and identifying interdependencies among the tasks.

Findings

01 Six main categories of web table research identified
02 Seminal approaches and resources summarized for each category
03 Interdependencies among different web table tasks highlighted

Abstract

Tables are a powerful and popular tool for organizing and manipulating data. A vast number of tables can be found on the Web, which represents a valuable knowledge resource. The objective of this survey is to synthesize and present two decades of research on web tables. In particular, we organize existing literature into six main categories of information access tasks: table extraction, table interpretation, table search, question answering, knowledge base augmentation, and table augmentation. For each of these tasks, we identify and describe seminal approaches, present relevant resources, and point out interdependencies among the different tasks.

Taxonomy

Topics: Data Quality and Management · Web Data Mining and Analysis · Semantic Web and Ontologies