Uncovering Limitations of Large Language Models in Information Seeking from Tables

Chaoxu Pang; Yixuan Cao; Chunhao Yang; Ping Luo

arXiv:2406.04113·cs.CL·June 7, 2024

TL;DR

This paper evaluates how well large language models seek information from tables. It introduces a new benchmark that exposes weaknesses in their understanding of table structure and in their robustness, highlighting the need for improved models.

Contribution

It presents a reliable, question-based benchmark for Table Information Seeking and analyzes LLMs' performance on it, exposing deficiencies in table understanding and robustness.

Findings

01

GPT-4-turbo's performance is only marginally satisfactory

02

Most models perform inadequately on TIS tasks

03

LLMs struggle with table structure comprehension

Abstract

Tables are recognized for their high information density and widespread usage, serving as essential sources of information. Seeking information from tables (TIS) is a crucial capability for Large Language Models (LLMs), serving as the foundation of knowledge-based Q&A systems. However, this field presently suffers from an absence of thorough and reliable evaluation. This paper introduces a more reliable benchmark for Table Information Seeking (TabIS). To avoid the unreliable evaluation caused by text similarity-based metrics, TabIS adopts a single-choice question format (with two options per question) instead of a text generation format. We establish an effective pipeline for generating options, ensuring their difficulty and quality. Experiments conducted on 12 LLMs reveal that while the performance of GPT-4-turbo is marginally satisfactory, both other proprietary and open-source models…
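The two-option, single-choice format described in the abstract can be scored with plain accuracy, which avoids text-similarity metrics entirely. The following is a minimal sketch under stated assumptions, not the TabIS implementation: the `TwoOptionItem` fields, the prompt wording, and the `build_prompt` and `accuracy` helpers are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TwoOptionItem:
    """One hypothetical benchmark item (field names are assumptions)."""
    table: str      # serialized table, e.g. Markdown or HTML
    question: str
    option_a: str
    option_b: str
    answer: str     # gold label, "A" or "B"


def build_prompt(item: TwoOptionItem) -> str:
    """Format an item as a single-choice prompt (wording is illustrative)."""
    return (
        f"{item.table}\n\n"
        f"Question: {item.question}\n"
        f"A. {item.option_a}\n"
        f"B. {item.option_b}\n"
        "Answer with A or B."
    )


def accuracy(items: Sequence[TwoOptionItem], predictions: Sequence[str]) -> float:
    """Fraction of items whose predicted label matches the gold label."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        raise ValueError("cannot score an empty benchmark")
    # Normalize predictions so that " b" and "B" count as the same label.
    correct = sum(
        1
        for item, pred in zip(items, predictions)
        if pred.strip().upper() == item.answer
    )
    return correct / len(items)
```

Because each question has exactly two options, random guessing has an expected accuracy of 0.5, so scores under this format are best read against that baseline rather than against zero.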

Code & Models

Repositories

coszero/TabIS
none · Official

Videos

Uncovering Limitations of Large Language Models in Information Seeking from Tables

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Advanced Text Analysis Techniques