TL;DR
StruBERT is a structure-aware BERT model that fuses the textual and structural information of data tables to improve table retrieval and table similarity tasks.
Contribution
This paper introduces StruBERT, a new neural model that fuses textual and structural data for enhanced table understanding and retrieval.
Findings
Substantial improvements over state-of-the-art in retrieval metrics
Effective fusion of textual and structural information
Versatile application to multiple table-related tasks
Abstract
A large amount of information is stored in data tables. Users can search for data tables using a keyword-based query. A table is composed primarily of data values that are organized in rows and columns providing implicit structural information. A table is usually accompanied by secondary information such as the caption, page title, etc., that form the textual information. Understanding the connection between the textual and structural information is an important yet neglected aspect in table retrieval as previous methods treat each source of information independently. In addition, users can search for data tables that are similar to an existing table, and this setting can be seen as a content-based table retrieval. In this paper, we propose StruBERT, a structure-aware BERT model that fuses the textual and structural information of a data table to produce context-aware representations…
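The abstract's distinction between textual information (caption, page title) and structural information (data values in rows and columns) can be illustrated with a small sketch. This is not StruBERT's actual input format: the `DataTable` class, the `row_views` and `column_views` helpers, and the serialization scheme are hypothetical names chosen for illustration. It only shows one plausible way to pair a table's text with each of its rows or columns before handing the result to a BERT-style encoder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataTable:
    """Hypothetical container: textual fields plus the row/column grid."""
    caption: str
    page_title: str
    header: List[str]
    rows: List[List[str]]

    def textual_information(self) -> str:
        # Secondary information that accompanies the table.
        return " ".join(p for p in (self.page_title, self.caption) if p)

    def columns(self) -> List[List[str]]:
        # Transpose the rows to obtain the column-wise view of the values.
        return [list(col) for col in zip(*self.rows)]


def row_views(table: DataTable) -> List[str]:
    """One sequence per row: table text, then header/value pairs of that row."""
    text = table.textual_information()
    views = []
    for row in table.rows:
        cells = " ".join(f"{h} {v}" for h, v in zip(table.header, row))
        views.append(f"{text} [SEP] {cells}")
    return views


def column_views(table: DataTable) -> List[str]:
    """One sequence per column: table text, then the header and its values."""
    text = table.textual_information()
    views = []
    for name, values in zip(table.header, table.columns()):
        views.append(f"{text} [SEP] {name} " + " ".join(values))
    return views
```

Here `[SEP]` is the standard BERT separator token written as plain text; a real pipeline would let the tokenizer insert it. Producing both row-wise and column-wise sequences keeps the table's text attached to each structural slice instead of encoding the two sources independently.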
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · StruBERT: Structure-aware BERT for Table Search and Matching · Dropout · Dense Connections · Attention Dropout · Linear Warmup With Linear Decay · Layer Normalization · Weight Decay
