Toward Real-World Table Agents: Capabilities, Workflows, and Design Principles for LLM-based Table Intelligence

Jiaming Tian; Liyao Li; Wentao Ye; Haobo Wang; Lingxin Wang; Lihua Yu; Zujie Ren; Gang Chen; Junbo Zhao

arXiv:2507.10281·cs.AI·July 15, 2025

TL;DR

This paper surveys LLM-based Table Agents, covering their capabilities and the challenges posed by noisy, heterogeneous real-world data, and offers insights for improving their robustness and generalization in practical applications.

Contribution

The survey defines five core competencies (C1 through C5) for LLM-based Table Agents and, through an examination of the Text-to-SQL Agent, analyzes the performance gap between academic benchmarks and real-world scenarios.

Findings

01. A Text-to-SQL performance gap between academic benchmarks and real-world scenarios, especially for open-source models

02. Five core competencies identified for table intelligence

03. Insights for improving robustness and generalization

Abstract

Tables are fundamental in domains such as finance, healthcare, and public administration, yet real-world table tasks often involve noise, structural heterogeneity, and semantic complexity--issues underexplored in existing research that primarily targets clean academic datasets. This survey focuses on LLM-based Table Agents, which aim to automate table-centric workflows by integrating preprocessing, reasoning, and domain adaptation. We define five core competencies--C1: Table Structure Understanding, C2: Table and Query Semantic Understanding, C3: Table Retrieval and Compression, C4: Executable Reasoning with Traceability, and C5: Cross-Domain Generalization--to analyze and compare current approaches. In addition, a detailed examination of the Text-to-SQL Agent reveals a performance gap between academic benchmarks and real-world scenarios, especially for open-source models. Finally, we…
