Improving Table Retrieval with Question Generation from Partial Tables

Hsing-Ping Liang; Che-Wei Chang; Yao-Chung Fan

arXiv:2508.06168 · cs.IR · August 11, 2025


TL;DR

This paper introduces QGpT, a method that uses large language models to generate synthetic questions from partial tables, improving table retrieval by better aligning table representations with user queries.

Contribution

The paper proposes a novel approach to enhance table retrieval by generating synthetic questions from partial tables to improve their embedding representations.

Findings

01. Significant improvement in retrieval performance across multiple benchmarks.

02. Effective enhancement for both dense and late-interaction retrievers.

03. No need to embed entire tables, reducing computational complexity.

Abstract

Recent advances in open-domain question answering over tables have widely adopted large language models (LLMs) under the Retriever-Reader architecture. Prior works have effectively leveraged LLMs to tackle the complex reasoning demands of the Reader component, such as text-to-text, text-to-SQL, and multi-hop reasoning. In contrast, work on the Retriever component has primarily focused on optimizing the query representation: training retrievers to retrieve relevant tables based on questions, or selecting keywords from questions to match table segments. However, little attention has been given to enhancing how tables themselves are represented in embedding space to better align with questions. To address this, we propose QGpT (Question Generation from Partial Tables), a simple yet effective method that uses an LLM to generate synthetic questions based on small portions of a table. These…
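The core idea in the abstract (take a small portion of a table, ask an LLM for questions about it, and use the result to represent the table for retrieval) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the pipe-separated serialization, the prompt wording, the choice of the first k rows as the "partial table", and the step that concatenates the questions with the partial table before embedding. The `fake_llm` callable is a stand-in for a real LLM call, and the table is toy data.

```python
from typing import Callable, List, Sequence


def take_partial_table(header: Sequence[str],
                       rows: Sequence[Sequence[str]],
                       k: int = 3) -> str:
    """Serialize the header plus the first k rows as pipe-separated text.

    Assumption: "partial table" is taken here to mean the header and the
    first k rows; the source only says "small portions of a table".
    """
    lines = [" | ".join(header)]
    lines += [" | ".join(str(cell) for cell in row) for row in rows[:k]]
    return "\n".join(lines)


def build_prompt(partial: str, n_questions: int) -> str:
    """Hypothetical prompt asking an LLM for questions about the excerpt."""
    return (
        f"Write {n_questions} questions that this table excerpt could "
        f"answer, one per line.\n\n{partial}"
    )


def generate_questions(partial: str,
                       llm: Callable[[str], str],
                       n_questions: int = 2) -> List[str]:
    """Call the supplied LLM function and split its reply into questions."""
    reply = llm(build_prompt(partial, n_questions))
    questions = [line.strip() for line in reply.splitlines() if line.strip()]
    return questions[:n_questions]


def build_retrieval_text(partial: str, questions: Sequence[str]) -> str:
    """Assumption: the text to embed is the partial table followed by the
    synthetic questions, so the table representation moves closer to the
    kind of queries users write."""
    return partial + "\n" + "\n".join(questions)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned questions."""
    return "Which team has the most wins?\nHow many wins does team B have?"


# Toy table, invented for this example.
header = ["Team", "Wins"]
rows = [["A", "10"], ["B", "7"], ["C", "3"], ["D", "1"]]

partial = take_partial_table(header, rows, k=2)
questions = generate_questions(partial, fake_llm)
retrieval_text = build_retrieval_text(partial, questions)
print(retrieval_text)
```

In a real pipeline, `retrieval_text` would be passed to whichever dense or late-interaction encoder the retriever uses, and `fake_llm` would be replaced by an actual model call. Only the first k rows are serialized, which matches the finding that the whole table does not need to be embedded.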


Taxonomy

Topics: Data Quality and Management · Handwritten Text Recognition Techniques · Web Data Mining and Analysis