Bridging the Gap: Deciphering Tabular Data Using Large Language Model
Hengyuan Zhang, Peng Chang, Zongcheng Ji

TL;DR
This paper introduces a novel approach to enhance large language models' understanding of tabular data for question answering, involving table serialization and correction mechanisms, achieving competitive results on specific datasets.
Contribution
The authors present the work as the first application of large language models to table-based question answering. It adds a table serialization module and a correction mechanism to improve the model's comprehension of table structure and content.
Findings
The method surpasses the SOTA by 1.2% on certain datasets.
Overall performance trails the SOTA by 11.7%.
The authors describe it as the first application of LLMs to table question answering.
Abstract
In the realm of natural language processing, the understanding of tabular data has perpetually stood as a focal point of scholarly inquiry. The emergence of expansive language models, exemplified by the likes of ChatGPT, has ushered in a wave of endeavors wherein researchers aim to harness these models for tasks related to table-based question answering. Central to our investigative pursuits is the elucidation of methodologies that amplify the aptitude of such large language models in discerning both the structural intricacies and inherent content of tables, ultimately facilitating their capacity to provide informed responses to pertinent queries. To this end, we have architected a distinctive module dedicated to the serialization of tables for seamless integration with expansive language models. Additionally, we've instituted a corrective mechanism within the model to rectify potential…
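To make the idea of table serialization concrete, here is a minimal sketch of how a table might be flattened into text and placed in a prompt. It is not the authors' module, whose design is not described in the text above. The pipe-delimited layout, the function names `serialize_table` and `build_prompt`, and the sample table are all assumptions chosen for illustration.

```python
from typing import Sequence


def serialize_table(header: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Flatten a table into pipe-delimited text, one line per row.

    Hypothetical format: the paper's actual serialization is not specified here.
    """
    lines = [" | ".join(str(h) for h in header)]
    for row in rows:
        # Reject ragged rows so that column alignment is never silently lost.
        if len(row) != len(header):
            raise ValueError("row length does not match header length")
        lines.append(" | ".join(str(cell) for cell in row))
    return "\n".join(lines)


def build_prompt(header: Sequence[str], rows: Sequence[Sequence[object]], question: str) -> str:
    """Combine the serialized table and a question into one prompt string."""
    table_text = serialize_table(header, rows)
    return f"Table:\n{table_text}\n\nQuestion: {question}\nAnswer:"


# Made-up example data, used only to show the output shape.
header = ["Team", "Wins"]
rows = [["Red", 12], ["Blue", 9]]
prompt = build_prompt(header, rows, "Which team has more wins?")
print(prompt)
```

The resulting string can be sent to any language model. A correction step, such as the one the abstract mentions, would operate on the model's answer afterwards and is not sketched here.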
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
