Making Table Understanding Work in Practice
Madelon Hulsebos, Sneha Gathani, James Gale, Isil Dillig, Paul Groth, and Çağatay Demiralp

TL;DR
This paper examines the gap between the high benchmark performance of deep learning models for table understanding and their applicability in practice, and proposes a framework and the SigmaTyper tool to address real-world deployment challenges.
Contribution
It introduces a practical framework for deploying table understanding models, including SigmaTyper, which combines hybrid modeling and human-in-the-loop customization.
Findings
SigmaTyper effectively detects semantic column types in real-world data.
The framework addresses domain customization and confidence issues.
The paper outlines future research directions for improving practical table understanding.
Abstract
Understanding the semantics of tables at scale is crucial for tasks like data integration, preparation, and search. Table understanding methods aim at detecting a table's topic, semantic column types, column relations, or entities. With the rise of deep learning, powerful models have been developed for these tasks with excellent accuracy on benchmarks. However, we observe that there exists a gap between the performance of these models on these benchmarks and their applicability in practice. In this paper, we address the question: what do we need for these models to work in practice? We discuss three challenges of deploying table understanding models and propose a framework to address them. These challenges include 1) difficulty in customizing models to specific domains, 2) lack of training data for typical database tables often found in enterprises, and 3) lack of confidence in the…
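To make the hybrid, human-in-the-loop idea more concrete, the following is a minimal sketch of a column type detector that combines pattern rules with a fallback step and abstains when confidence is low. It is not SigmaTyper's actual pipeline: the names `RULES`, `HEADER_HINTS`, `add_rule`, `detect_type`, the fixed 0.6 header confidence, and the 0.8 threshold are all assumptions made for illustration, and the header lookup merely stands in for a learned model.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    semantic_type: Optional[str]  # None means the detector abstains
    confidence: float
    source: str


# Hypothetical rule set; a real deployment would hold domain-specific patterns.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "zip_code": re.compile(r"^\d{5}$"),
}

# Hypothetical stand-in for a learned model: a fixed header lookup.
HEADER_HINTS = {"email": "email", "e-mail": "email", "zip": "zip_code"}


def add_rule(name: str, pattern: str) -> None:
    """Human-in-the-loop customization: a user registers a new type pattern."""
    RULES[name] = re.compile(pattern)


def rule_step(values: List[str]) -> Prediction:
    """Score each rule by the share of non-empty values it matches."""
    best = Prediction(None, 0.0, "rules")
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return best
    for name, pattern in RULES.items():
        share = sum(bool(pattern.match(c)) for c in cells) / len(cells)
        if share > best.confidence:
            best = Prediction(name, share, "rules")
    return best


def header_step(header: str) -> Prediction:
    """Fallback step: guess the type from the column header alone."""
    hint = HEADER_HINTS.get(header.strip().lower())
    if hint is None:
        return Prediction(None, 0.0, "header")
    return Prediction(hint, 0.6, "header")  # assumed fixed confidence


def detect_type(header: str, values: List[str], threshold: float = 0.8) -> Prediction:
    """Take the most confident step; abstain below the threshold."""
    best = max([rule_step(values), header_step(header)], key=lambda p: p.confidence)
    if best.semantic_type is None or best.confidence < threshold:
        return Prediction(None, best.confidence, "abstain")
    return best
```

In this sketch, a column the detector abstains on is the natural place to ask a user for input, for example by registering a new pattern with `add_rule`, after which similar columns are typed by the rule step.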
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Web Data Mining and Analysis
