Bridge the Gap between Language models and Tabular Understanding

Nuo Chen; Linjun Shou; Ming Gong; Jian Pei; Chenyu You; Jianhui Chang; Daxin Jiang; Jia Li

arXiv:2302.09302 · cs.CL · February 21, 2023 · 6 cites

TL;DR

This paper introduces UTP, a pre-training approach for tabular language models that supports multiple input types and uses contrastive learning to improve table-text understanding and task performance.

Contribution

UTP is a novel pre-training method that dynamically handles table, text, and combined inputs, bridging the gap between pre-training and fine-tuning phases.

Findings

01 UTP outperforms existing models on table retrieval tasks.

02 UTP achieves superior results on table question answering.

03 The approach improves alignment between table and text modalities.
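
This listing says UTP uses contrastive learning to align tables and text, but it does not give the objective itself. As a rough illustration only, the sketch below implements a generic InfoNCE-style loss over paired text and table embeddings. It is a stand-in technique, not the paper's stated objective. The function name `info_nce`, the `temperature` value, and the toy vectors are all assumptions made for the example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(text_vecs, table_vecs, temperature=0.1):
    """Generic InfoNCE loss: text i should match table i and no other table.

    Hypothetical helper for illustration; not taken from the paper.
    """
    texts = [normalize(v) for v in text_vecs]
    tables = [normalize(v) for v in table_vecs]
    total = 0.0
    for i, text in enumerate(texts):
        logits = [dot(text, table) / temperature for table in tables]
        # Numerically stable log-sum-exp over all candidate tables.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the paired table as the correct class.
        total += log_z - logits[i]
    return total / len(texts)


# Toy embeddings: each text vector is paired with the table at the same index.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With these toy vectors the loss is lower when each text embedding points the same way as its paired table embedding than when the pairs are swapped. That is the sense in which a contrastive objective rewards alignment between the two modalities.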

Abstract

The table pretrain-then-finetune paradigm has been proposed and adopted at a rapid pace following the success of pre-training in the natural language domain. Despite the promising findings on tabular pre-trained language models (TPLMs), there is an input gap between the pre-training and fine-tuning phases. For instance, TPLMs jointly pre-trained on table and text input can be effective for tasks that also take joint table-text input, such as table question answering, but may fail on tasks that take only tables or only text as input, such as table retrieval. To this end, we propose UTP, an approach that dynamically supports three types of multi-modal inputs: table-text, table, and text. Specifically, UTP is pre-trained with two strategies: (1) We first apply a universal masked language modeling objective to each kind of input, forcing the model to adapt to various inputs. (2) We then present Cross-Modal…
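
The first strategy in the abstract, masking over each of the three input types, can be pictured with a small sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the table linearization, the special tokens `[HEAD]` and `[ROW]`, the masking rate, and the helper names `linearize_table`, `build_inputs`, and `mask_tokens` are all hypothetical.

```python
import random

MASK = "[MASK]"
# Structural tokens are never masked. [HEAD] and [ROW] are assumed markers.
SPECIAL = {"[CLS]", "[SEP]", "[HEAD]", "[ROW]"}


def linearize_table(header, rows):
    """Flatten a table into tokens: header cells first, then each row's cells."""
    tokens = ["[HEAD]"] + list(header)
    for row in rows:
        tokens += ["[ROW]"] + [str(cell) for cell in row]
    return tokens


def build_inputs(text_tokens, header, rows):
    """Build the three input variants named in the abstract."""
    table_tokens = linearize_table(header, rows)
    return {
        "table-text": ["[CLS]"] + text_tokens + ["[SEP]"] + table_tokens,
        "table": ["[CLS]"] + table_tokens,
        "text": ["[CLS]"] + text_tokens,
    }


def mask_tokens(tokens, rate, rng):
    """Replace non-special tokens with [MASK] at the given rate.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere.
    """
    masked, labels = [], []
    for token in tokens:
        if token not in SPECIAL and rng.random() < rate:
            masked.append(MASK)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


# The same masking routine is applied to every input variant.
rng = random.Random(0)
inputs = build_inputs(["who", "won"], ["year", "winner"], [[2020, "A"]])
batches = {kind: mask_tokens(tokens, 0.15, rng) for kind, tokens in inputs.items()}
```

The point of the sketch is that one masking routine serves all three variants, so the model sees table-only and text-only sequences during pre-training as well as joint ones.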

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Contrastive Learning