MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
Junjie Xing, Yeye He, Mengyu Zhou, Haoyu Dong, Shi Han, Lingjiao Chen, Dongmei Zhang, Surajit Chaudhuri, H. V. Jagadish

TL;DR
MMTU is a comprehensive benchmark with over 28,000 questions across 25 real-world table tasks, designed to evaluate models' ability to understand, reason over, and manipulate complex tables at an expert level. Results show significant room for improvement.
Contribution
This work introduces MMTU, the first large-scale, diverse benchmark for comprehensive table understanding and reasoning, covering a broad spectrum of real-world tasks faced by professional users.
Findings
Current models such as GPT-5 and DeepSeek R1 score only around 69% and 57%, respectively, on MMTU.
Models struggle with complex table understanding, reasoning, and coding tasks.
There is substantial room for improvement in models' performance on real-world table tasks.
Abstract
Tables and table-based use cases play a crucial role in many important real-world applications, such as spreadsheets, databases, and computational notebooks, which traditionally require expert-level users like data engineers, data analysts, and database administrators to operate. Although LLMs have shown remarkable progress in working with tables (e.g., in spreadsheet and database copilot scenarios), comprehensive benchmarking of such capabilities remains limited. In contrast to an extensive and growing list of NLP benchmarks, evaluations of table-related tasks are scarce, and narrowly focus on tasks like NL-to-SQL and Table-QA, overlooking the broader spectrum of real-world tasks that professional users face. This gap limits our understanding and model progress in this important area. In this work, we introduce MMTU, a large-scale benchmark with over 28K questions across 25…
Taxonomy
Topics: Data Quality and Management · Data Visualization and Analytics · Advanced Database Systems and Queries
