MT-RAIG: Novel Benchmark and Evaluation Framework for Retrieval-Augmented Insight Generation over Multiple Tables
Kwangwook Seo, Donguk Kwon, and Dongha Lee

TL;DR
This paper introduces MT-RAIG Bench, a new benchmark and evaluation framework for assessing systems that generate insights by reasoning over multiple tables, addressing limitations of previous single-table approaches.
Contribution
It presents a novel benchmark and a fine-grained evaluation framework for multi-table insight generation, advancing beyond existing single-table reasoning methods.
Findings
Current LLMs struggle with complex multi-table reasoning
MT-RAIG Bench provides a challenging testbed for future research
The evaluation framework, MT-RAIG Eval, aligns better with human quality judgments than existing automatic evaluation methods
Abstract
Recent advancements in table-based reasoning have expanded beyond factoid-level QA to address insight-level tasks, where systems should synthesize implicit knowledge in the table to provide explainable analyses. Although effective, existing studies remain confined to scenarios where a single gold table is given alongside the user query, failing to address cases where users seek comprehensive insights from multiple unknown tables. To bridge these gaps, we propose MT-RAIG Bench, designed to evaluate systems on Retrieval-Augmented Insight Generation over Multiple-Tables. Additionally, to tackle the suboptimality of existing automatic evaluation methods in the table domain, we further introduce a fine-grained evaluation framework, MT-RAIG Eval, which achieves better alignment with human quality judgments on the generated insights. We conduct extensive experiments and reveal that even frontier…
Taxonomy
Topics: Information Retrieval and Search Behavior · Image Retrieval and Classification Techniques
