Table Search Using a Deep Contextualized Language Model

Zhiyu Chen; Mohamed Trabelsi; Jeff Heflin; Yinan Xu; Brian D. Davison

arXiv:2005.09207 · cs.IR · May 28, 2020

TL;DR

This paper leverages BERT, a deep contextualized language model, to improve ad hoc table retrieval by encoding table content effectively and combining it with prior features, achieving state-of-the-art results.

Contribution

It introduces a novel approach that encodes table structure with BERT and integrates prior retrieval features, enhancing table search performance.

Findings

01 Outperforms previous state-of-the-art methods on public datasets.

02 Significantly improves retrieval accuracy over BERT baselines.

03 Demonstrates the effectiveness of combining BERT with traditional features.

Abstract

Pretrained contextualized language models such as BERT have achieved impressive results on various natural language processing benchmarks. Benefiting from multiple pretraining tasks and large-scale training corpora, pretrained models can capture complex syntactic word relations. In this paper, we use the deep contextualized language model BERT for the task of ad hoc table retrieval. We investigate how to encode table content while respecting both the table structure and the input length limit of BERT. We also propose an approach that incorporates features from prior literature on table retrieval and jointly trains them with BERT. In experiments on public datasets, we show that our best approach outperforms the previous state-of-the-art method and BERT baselines by a large margin under different evaluation metrics.
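
To make the encoding problem in the abstract concrete, here is a minimal sketch of one way to flatten a table into a single query-table sequence under a fixed length budget. It is not the paper's method: the function name `linearize_table`, the row-by-row selection rule, and the whitespace split standing in for BERT's WordPiece tokenizer are all assumptions made for illustration. Only the `[CLS]`/`[SEP]` markers and the 512-token input limit come from BERT itself.

```python
from typing import List

MAX_LEN = 512  # BERT's maximum input length in tokens


def linearize_table(query: str, caption: str, header: List[str],
                    rows: List[List[str]], max_len: int = MAX_LEN) -> List[str]:
    """Flatten a table into: [CLS] query [SEP] caption header rows... [SEP].

    Hypothetical helper: whitespace splitting is a stand-in for a real
    WordPiece tokenizer, and whole-row selection is an assumed strategy.
    """
    tokens = ["[CLS]"] + query.split() + ["[SEP]"]
    tokens += caption.split() + " ".join(header).split()

    budget = max_len - 1          # reserve one slot for the final [SEP]
    tokens = tokens[:budget]      # truncate if query + caption + header overflow

    for row in rows:
        row_tokens = " ".join(row).split()
        if len(tokens) + len(row_tokens) > budget:
            break                 # stop at the first row that does not fit whole
        tokens += row_tokens

    return tokens + ["[SEP]"]


seq = linearize_table(
    query="world cup winners",
    caption="FIFA World Cup",
    header=["Year", "Winner"],
    rows=[["2014", "Germany"], ["2018", "France"]],
    max_len=13,
)
```

With the small `max_len` used here, only the first row fits, which shows why the choice of what to keep matters when a table exceeds the input limit.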

Code & Models

Repositories

Zhiyu-Chen/SIGIR2020-BERT-Table-Search
pytorch

Taxonomy

Methods

Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections