Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks

Soumajyoti Sarkar; Leonard Lausen

arXiv:2310.00789·cs.CL·October 3, 2023·1 cites

Open Access

TL;DR

This paper explores the potential of a unified encoder-decoder LLM pretraining approach to handle diverse table data tasks effectively, demonstrating significant performance improvements across multiple scales and tasks.

Contribution

It introduces a shared pretraining methodology for encoder-decoder LLMs on table data, showing scalability from 770M to 11B parameters and benefits of self-supervised objectives.

Findings

01. Pretraining with self-supervised objectives boosts task performance.

02. Unified models perform well across diverse table tasks.

03. Scaling models improves accuracy on table-specific tasks.

Abstract

Tables stored in databases and tables present in web pages and articles account for a large part of the semi-structured data available on the internet. It is therefore pertinent to develop a modeling approach with large language models (LLMs) that can solve diverse table tasks such as semantic parsing, question answering, and classification. Traditionally, separate models were specialized for each task. This raises the question of how far we can go in building a unified model that works well on some table tasks without significant degradation on others. To that end, we attempt to create a shared modeling approach in the pretraining stage with encoder-decoder style LLMs that can cater to diverse tasks. We evaluate our approach that continually pretrains and finetunes different model families of T5 with data from tables and…

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques

Methods: Gated Linear Unit · Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Dropout · Attention Dropout · Dense Connections · Inverse Square Root Schedule · Linear Layer