TabRAG: Improving Tabular Document Question Answering for Retrieval Augmented Generation via Structured Representations

Jacob Si; Mike Qu; Michelle Lee; Marek Rei; Yingzhen Li

arXiv:2511.06582·cs.CL·February 3, 2026

Open Access

TL;DR

TabRAG introduces a structured, parsing-based approach to improve question answering on tabular documents by preserving two-dimensional semantics and leveraging vision-language models for better extraction and understanding.

Contribution

TabRAG is a framework that combines layout segmentation and hierarchical table parsing with in-context learning to improve question answering over tabular documents.

Findings

01 Outperforms existing parsing techniques on multiple benchmarks.

02 Effective in handling various table styles and formats.

03 Improves the accuracy of question answering on tabular data.

Abstract

Incorporating external knowledge bases in traditional retrieval-augmented generation (RAG) relies on parsing the document, followed by querying a language model with the parsed information via in-context learning. While effective for text-based documents, question answering on tabular documents often fails to generate plausible responses. Standard parsing techniques lose the two-dimensional structural semantics critical for cell interpretation. In this work, we present TabRAG, a parsing-based RAG framework designed to improve tabular document question answering via structured representations. Our framework consists of layout segmentation that decomposes the document inputs into a series of components, enabling fine-grained extraction. Subsequently, a vision-language model parses and extracts the document tables into a hierarchically structured representation. In order to cater to various…
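To make the idea of a structured representation concrete, the following is a minimal sketch, not the authors' format: the abstract does not specify a schema, so the `Cell` fields, the function names, and the prompt wording are all hypothetical. It contrasts a naive flattening, which discards which header each value belongs to, with a per-cell record that keeps the row and column header paths, and then places those records into an in-context prompt.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Cell:
    """One table cell with its header context (hypothetical schema)."""
    row_path: List[str]  # row headers, outermost first
    col_path: List[str]  # column headers, outermost first
    value: str


def flatten_naive(col_headers, row_headers, body) -> str:
    """Baseline for contrast: join all text in reading order.

    The link between a value and its headers is lost.
    """
    parts = [" ".join(path) for path in col_headers]
    for path, row in zip(row_headers, body):
        parts.append(" ".join(path))
        parts.extend(row)
    return " ".join(parts)


def to_structured(col_headers, row_headers, body) -> List[Cell]:
    """Attach the full row and column header path to every cell."""
    cells = []
    for r, row_path in enumerate(row_headers):
        for c, col_path in enumerate(col_headers):
            cells.append(Cell(list(row_path), list(col_path), body[r][c]))
    return cells


def build_prompt(cells: List[Cell], question: str) -> str:
    """Serialize the cells as JSON and wrap them in a simple QA prompt."""
    context = json.dumps([asdict(cell) for cell in cells], indent=2)
    return f"Table cells:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Toy table with a two-level column header (made-up values).
    col_headers = [["Revenue", "2023"], ["Revenue", "2024"]]
    row_headers = [["Europe"], ["Asia"]]
    body = [["10", "12"], ["7", "9"]]

    print(flatten_naive(col_headers, row_headers, body))
    cells = to_structured(col_headers, row_headers, body)
    print(build_prompt(cells, "What was revenue in Asia in 2024?"))
```

In this sketch every value carries its own header path, so a language model reading the prompt does not have to infer the row and column a number came from. In the pipeline the abstract describes, a vision-language model would produce the structured output from a segmented document region; here the header paths are supplied by hand purely for illustration.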

Taxonomy

Topics: Handwritten Text Recognition Techniques · Topic Modeling · Multimodal Machine Learning Applications