Investigating Table-to-Text Generation Capabilities of LLMs in Real-World Information Seeking Scenarios
Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, Arman Cohan

TL;DR
This paper evaluates the ability of large language models, especially GPT-4, to generate, evaluate, and provide feedback on table-to-text tasks in real-world information seeking scenarios, highlighting GPT-4's effectiveness and the gap with open-source models.
Contribution
It introduces two new datasets, LoTNLG for data insight generation and F2WTQ for query-based table-to-text generation, and systematically assesses LLMs' capabilities on these tasks in real-world contexts.
Findings
GPT-4 effectively generates, evaluates, and provides feedback on table-to-text tasks.
Open-source LLMs such as Tulu and LLaMA-2 lag significantly behind GPT-4 in performance.
Abstract
Tabular data is prevalent across various industries, necessitating significant time and effort for users to understand and manipulate for their information-seeking purposes. The advancements in large language models (LLMs) have shown enormous potential to improve user efficiency. However, the adoption of LLMs in real-world applications for table information seeking remains underexplored. In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios. These include the LogicNLG and our newly-constructed LoTNLG datasets for data insight generation, along with the FeTaQA and our newly-constructed F2WTQ datasets for query-based generation. We structure our investigation around three research questions, evaluating the performance of LLMs in table-to-text generation, automated evaluation, and feedback…
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Information Retrieval and Search Behavior
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Residual Connection · Byte Pair Encoding · Dense Connections · Layer Normalization · Label Smoothing · Position-Wise Feed-Forward Layer
